key: cord-267948-jveh2w09 authors: rossen, lauren m.; branum, amy m.; ahmad, farida b.; sutton, paul; anderson, robert n. title: excess deaths associated with covid-19, by age and race and ethnicity — united states, january 26–october 3, 2020 date: 2020-10-23 journal: mmwr morb mortal wkly rep doi: 10.15585/mmwr.mm6942e2 sha: doc_id: 267948 cord_uid: jveh2w09 as of october 15, 216,025 deaths from coronavirus disease 2019 (covid-19) have been reported in the united states*; however, this number might underestimate the total impact of the pandemic on mortality. measures of excess deaths have been used to estimate the impact of public health pandemics or disasters, particularly when there are questions about underascertainment of deaths directly attributable to a given event or cause (1-6).† excess deaths are defined as the number of persons who have died from all causes, in excess of the expected number of deaths for a given place and time. this report describes trends and demographic patterns in excess deaths during january 26-october 3, 2020. expected numbers of deaths were estimated using overdispersed poisson regression models with spline terms to account for seasonal patterns, using provisional mortality data from cdc's national vital statistics system (nvss) (7). weekly numbers of deaths by age group and race/ethnicity were assessed to examine the difference between the weekly number of deaths occurring in 2020 and the average number occurring in the same week during 2015-2019 and the percentage change in 2020. overall, an estimated 299,028 excess deaths have occurred in the united states from late january through october 3, 2020, with two thirds of these attributed to covid-19. the largest percentage increases were seen among adults aged 25-44 years and among hispanic or latino (hispanic) persons. 
these results provide information about the degree to which covid-19 deaths might be underascertained and inform efforts to prevent mortality directly or indirectly associated with the covid-19 pandemic, such as efforts to minimize disruptions to health care. determine the degree to which observed numbers of deaths differ from historical norms. in april 2020, cdc's national center for health statistics (nchs) began publishing data on excess deaths associated with the covid-19 pandemic (7, 8) . this report describes trends and demographic patterns in the number of excess deaths occurring in the united states from january 26, 2020, through october 3, 2020, and differences by age and race/ ethnicity using provisional mortality data from the nvss. § excess deaths are typically defined as the number of persons who have died from all causes, in excess of the expected number of deaths for a given place and time. a detailed description of the methodology for estimating excess deaths has been described previously (7) . briefly, expected numbers of deaths are estimated using overdispersed poisson regression models with spline terms to account for seasonal patterns. the average expected number, as well as the upper bound of the 95% prediction interval (the range of values likely to contain the value of a single new observation), are used as thresholds to determine the number of excess deaths (i.e., observed numbers above each threshold) and percentage excess (excess deaths divided by average expected number of deaths). estimates described here refer to the number or percentage above the average; estimates above the upper bound threshold have been published elsewhere (7) . observed numbers of deaths are weighted to account for incomplete reporting by jurisdictions (50 states and the district of columbia [dc]) in the most recent weeks, where the weights were estimated based on completeness of provisional data in the past year (7) . 
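the threshold rule described above can be illustrated with a minimal sketch. it assumes the expected counts and the upper bound of the prediction interval are already available (the regression model itself is not fitted here), and all numbers and function names are hypothetical, chosen only for illustration.

```python
# Minimal sketch: excess deaths above a threshold and percentage excess.
# The weekly counts below are made up; they are not NVSS data.

def excess_deaths(observed, threshold):
    """Weekly excess: observed count above the threshold, floored at zero."""
    return [max(o - t, 0.0) for o, t in zip(observed, threshold)]

def percent_excess(observed, expected_avg):
    """Excess above the average expected count, as a percentage of that average."""
    excess = excess_deaths(observed, expected_avg)
    return [100.0 * e / a for e, a in zip(excess, expected_avg)]

# Hypothetical values for three weeks.
observed = [58000.0, 61000.0, 79000.0]
expected_avg = [59000.0, 58000.0, 56000.0]   # average expected number
upper_bound = [61000.0, 60000.0, 58000.0]    # upper bound of the 95% prediction interval

above_avg = excess_deaths(observed, expected_avg)
above_upper = excess_deaths(observed, upper_bound)
pct = percent_excess(observed, expected_avg)
```

using the average as the threshold gives the larger estimate and using the upper bound gives the more conservative one, which is why the report quotes a range of excess deaths.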
weekly nvss data on excess deaths occurring from january 26 (the week ending february 1), 2020, through october 3, 2020, were used to quantify the number of excess deaths and the percentage excess for deaths from all causes and deaths from all causes excluding covid-19. ¶ non-hispanic american indian or alaska native [ai/an], and other/unknown race/ethnicity, which included non-hispanic native hawaiian or other pacific islander, non-hispanic multiracial, and unknown) were used to examine the difference between the weekly number of deaths occurring in 2020 and the average number occurring in the same week during 2015-2019. these values were used to calculate an average percentage change in 2020 (i.e., above or below average compared with past years), over the period of analysis, by age group and race and hispanic ethnicity. nvss data in this report include all deaths occurring in the 50 states and dc and are not limited to u.s. residents. approximately 0.2% of decedents overall are foreign residents. r statistical software (version 3.5.0; the r foundation) was used to conduct all analyses. from january 26, 2020, through october 3, 2020, an estimated 299,028 more persons than expected have died in the united states.** excess deaths reached their highest points to date during the weeks ending april 11 (40.4% excess) and august 8, 2020 (23.5% excess) ( figure 1 ). two thirds of excess deaths during the analysis period (66.2%; 198,081) were attributed to covid-19 and the remaining third to other causes † † (figure 1 ). the total number of excess deaths (deaths above average levels) from january 26 through october 3 ranged from a low of approximately 841 in the youngest age group (<25 years) to a high of 94,646 among adults aged 75-84 years. § § however, the average percentage change in deaths over this period compared with previous years was largest for adults aged 25-44 years (26.5%) ( figure 2 ). 
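the comparison of 2020 weekly counts with the 2015-2019 same-week average, described in the methods above, can be sketched as follows. the report does not spell out how the average percentage change was aggregated over the period; this sketch assumes the total difference divided by the total baseline, and every count shown is hypothetical.

```python
from statistics import mean

def weekly_baseline(history):
    """history maps year -> list of weekly counts; returns the per-week mean across years."""
    years = sorted(history)
    n_weeks = len(history[years[0]])
    return [mean(history[y][w] for y in years) for w in range(n_weeks)]

def percent_change(counts_2020, baseline):
    """Assumed aggregation: summed weekly differences relative to the summed baseline."""
    diff = sum(c - b for c, b in zip(counts_2020, baseline))
    return 100.0 * diff / sum(baseline)

# Hypothetical weekly deaths for one demographic group (three weeks only).
history = {
    2015: [100, 110, 120],
    2016: [102, 108, 118],
    2017: [98, 112, 122],
    2018: [100, 110, 120],
    2019: [100, 110, 120],
}
counts_2020 = [95, 132, 150]

baseline = weekly_baseline(history)
change = percent_change(counts_2020, baseline)
```

weeks below the baseline contribute negative differences here, so a group can end up below average overall even if some weeks show increases.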
overall, numbers of deaths among persons aged <25 years were 2.0% below average, ¶ ¶ and among adults aged 45-64 years, 65-74 years, 75-84 years, and ≥85 years were 14.4%, 24.1%, 21.5%, and 14.7% above average, respectively. when examined by race and ethnicity, the total numbers of excess deaths during the analysis period ranged from a low of approximately 3,412 among ai/an persons to a high of 171,491 among white persons. for white persons, deaths were 11.9% higher when compared to average numbers during 2015-2019. however, some racial and ethnic subgroups experienced disproportionately higher percentage increases in deaths (figure 3). specifically, the average percentage increase over this period was largest for hispanic persons (53.6%). deaths were 28.9% above average for ai/an persons, 32.9% above average for black persons, 34.6% above average for those of other or unknown race or ethnicity, and 36.6% above average for asian persons.

** excess deaths over this period ranged from 224,173 to 299,028. the lower end of this range corresponds to the total number above the upper bound of the 95% prediction intervals, and the upper end of the range corresponds to the total number above the average expected counts. deaths above the upper bound threshold are significantly higher than expected. https://www.cdc.gov/nchs/nvss/vsrr/covid19/excess_deaths.htm.
† † excess deaths attributed to covid-19 were calculated by subtracting the number of excess deaths from all causes excluding covid-19 from the total number of excess deaths from all causes. these excess death estimates were based on the numbers of deaths above the average expected number. using the upper bound of the 95% prediction interval for the expected numbers (the upper bound threshold), an estimated 224,173 excess deaths occurred during this period, 85.5% of which were attributed to covid-19.
§ § weeks when the observed numbers of deaths were below the average numbers from 2015 to 2019 were excluded from the total numbers of excess deaths above average levels (i.e., negative values were treated as 0 excess deaths).
¶ ¶ the total average percentage change in the number of deaths occurring from the week ending february 1, 2020, through october 3, 2020, included weeks where the percentage difference was negative (i.e., deaths were fewer than expected). this mainly affected the youngest age group, among whom, overall, deaths during this period were 2.0% below average. excluding weeks with negative numbers of excess deaths results in overall percentage increases of 4.2% for decedents aged <25 years. increases for other age groups were similar when excluding weeks with negative numbers of excess deaths, with the exception of those aged ≥85 years, among whom the percentage increase was larger (18.1%) when weeks with negative values were excluded.

based on nvss data, excess deaths have occurred every week in the united states since march 2020. an estimated 299,028 more persons than expected have died since january 26, 2020; approximately two thirds of these deaths were attributed to covid-19. a recent analysis of excess deaths from march through july reported very similar findings, but that study did not include more recent data through september (5). although more excess deaths have occurred among older age groups, relative to past years, adults aged 25-44 years have experienced the largest average percentage increase in the number of deaths from all causes from late january through october 3, 2020. the age distribution of covid-19 deaths shifted toward younger age groups from may through august (9); however, these disproportionate increases might also be related to underlying trends in other causes of death. future analyses might shed light on the extent to which increases among younger age groups are driven by covid-19 or by other causes of death.
among racial and ethnic groups, the smallest average percentage increase in numbers of deaths compared with previous years occurred among white persons (11.9%) and the largest for hispanic persons (53.6%), with intermediate increases (28.9%-36.6%) among ai/an, black, and asian persons. these disproportionate increases among certain racial and ethnic groups are consistent with noted disparities in covid-19 mortality.***

*** https://www.cdc.gov/coronavirus/2019-ncov/community/health-equity/race-ethnicity.html.

the findings in this report are subject to at least five limitations. first, the weighting of provisional nvss mortality data might not fully account for reporting lags, particularly in recent weeks. estimated numbers of deaths in the most recent weeks are likely underestimated and will increase as more data become available. second, there is uncertainty associated with the models used to generate the expected numbers of deaths in a given week. a range of values for excess death estimates is provided elsewhere (7), but these ranges might not reflect all of the sources of uncertainty, such as the completeness of provisional data. third, different methods or models for estimating the expected numbers of deaths might lead to different results. estimates of the number or percentage of deaths above average levels by race/ethnicity and age reported here might not sum to the total numbers of excess deaths reported elsewhere, which might have been estimated using different methodologies. fourth, using the average numbers of deaths from past years might underestimate the total expected numbers because of population growth or aging, or because of increasing trends in certain causes such as drug overdose mortality.
finally, estimates of excess deaths attributed to covid-19 might underestimate the actual number directly attributable to covid-19, because deaths from other causes might represent misclassified covid-19-related deaths or deaths indirectly caused by the pandemic. specifically, deaths from circulatory diseases, alzheimer disease and dementia, and respiratory diseases have increased in 2020 relative to past years (7), and it is unclear to what extent these represent misclassified covid-19 deaths or deaths indirectly related to the pandemic (e.g., because of disruptions in health care access or utilization). despite these limitations, however, this report demonstrates important trends and demographic patterns in excess deaths that occurred during the covid-19 pandemic. these results provide more information about deaths during the covid-19 pandemic and inform public health messaging and mitigation efforts focused on the prevention of infection and mortality directly or indirectly associated with the covid-19 pandemic and the elimination of health inequities. cdc continues to recommend the use of masks, frequent handwashing, and maintenance of social distancing to prevent covid-19. † † †

† † † https://www.cdc.gov/coronavirus/2019-ncov/prevent-getting-sick/prevention.html.

what is already known about this topic? as of october 15, 216,025 deaths from covid-19 have been reported in the united states; however, this might underestimate the total impact of the pandemic on mortality.
what is added by this report? overall, an estimated 299,028 excess deaths occurred from late january through october 3, 2020, with 198,081 (66%) excess deaths attributed to covid-19. the largest percentage increases were seen among adults aged 25-44 years and among hispanic or latino persons.
what are the implications for public health practice? these results inform efforts to prevent mortality directly or indirectly associated with the covid-19 pandemic, such as efforts to minimize disruptions to health care.

corresponding author: lauren m. rossen, lrossen@cdc.gov.
1 national center for health statistics, cdc.
all authors have completed and submitted the international committee of medical journal editors form for disclosure of potential conflicts of interest. no potential conflicts of interest were disclosed.

new york city department of health and mental hygiene (dohmh) covid-19 response team. preliminary estimate of excess mortality during the covid-19 outbreak
differential and persistent risk of excess mortality from hurricane maria in puerto rico: a time-series analysis
estimation of excess deaths associated with the covid-19 pandemic in the united states
estimating the number of excess deaths attributable to heat in 297
excess deaths from covid-19 and other causes
every body counts: measuring mortality from the covid-19 pandemic
us department of health and human services, cdc, national center for health statistics
us department of health and human services, cdc, national center for health statistics
race, ethnicity, and age trends in persons who died from covid-19-united states

[figure axis label: week of death]

key: cord-261437-x2k9apav authors: li, d.; croft, d. p.; ossip, d. j.; xie, z. title: are vapers more susceptible to covid-19 infection?
date: 2020-05-09 journal: medrxiv : the preprint server for health sciences doi: 10.1101/2020.05.05.20092379 sha: doc_id: 261437 cord_uid: x2k9apav
background: covid-19, caused by severe acute respiratory syndrome coronavirus 2 (sars-cov-2), was declared a global pandemic in march 2020. electronic cigarette use (vaping) rapidly gained popularity in the us in recent years. whether electronic cigarette users (vapers) are more susceptible to covid-19 infection is unknown. methods: using integrated data in each us state from the 2018 behavioral risk factor surveillance system (brfss), united states census bureau and the 1point3acres.com website, generalized estimating equation (gee) models with negative binomial distribution assumption and log link functions were used to examine the association of weighted proportions of vapers with number of covid-19 infections and deaths in the us. results: the weighted proportion of vapers who used e-cigarettes every day or some days ranged from 2.86% to 6.42% for us states. statistically significant associations were observed between the weighted proportion of vapers and number of covid-19 infected cases as well as covid-19 deaths in the us after adjusting for the weighted proportion of smokers and other significant covariates in the gee models. with every one percent increase in weighted proportion of vapers in each state, the number of covid-19 infected cases increases by 0.3139 (95% ci: 0.0554, 0.5723) and the number of covid-19 deaths increases by 0.3705 (95% ci: 0.0623, 0.6786) on the log scale in each us state. conclusions: the positive associations between the proportion of vapers and the number of covid-19 infected cases and deaths in each us state suggest an increased susceptibility of vapers to covid-19 infections and deaths.
the novel coronavirus disease 2019 (covid-19) outbreak was declared a global pandemic by the world health organization (who) on march 11, 2020.
1 as of april 28, 2020, there were over three million covid-19 infected cases and over 200,000 deaths globally. 2 in the united states, the total number of infected covid-19 cases exceeded one million, with over 57,000 deaths reported by april 28, 2020. covid-19 infection presents with cough, dyspnea and fever among other systemic symptoms and can lead to pneumonia and acute hypoxemic respiratory failure. 3 electronic cigarettes (e-cigarettes), promoted as an alternative for cigarette smoking, rapidly gained popularity in recent years in the us. in 2018, the prevalence of current e-cigarette use (vaping) in us adults was 3.2%. 4 recent studies on the associations of vaping and health symptoms/diseases have observed associations between vaping and symptoms of wheezing and self-reported chronic obstructive pulmonary disease (copd), along with increased inflammation in bronchial epithelial cells and alterations in the pulmonary immune response to infection. 5-10 tobacco control researchers have raised concerns that vapers may be more susceptible to covid-19 infections and could develop more severe covid-19 symptoms. 11 however, there is very limited evidence on the association between vaping and covid-19 infection. we examine the association of vaping with covid-19 infections and deaths, using the integrated state-level weighted proportions of current e-cigarette users (vapers) from the 2018 behavioral risk factor surveillance system (brfss) survey data, the population size and land area in 2018 in each state from the united states census bureau, and the daily number of covid-19 infected cases and deaths in each state from the 1point3acres.com website during the period from january 21, 2020 to april 25, 2020 in the united states. our study is the first one to provide evidence on the association of vaping with covid-19 infections and deaths at the us population level.
the copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. all rights reserved. no reuse allowed without permission. this version posted may 9, 2020. https://doi.org/10.1101/2020.05.05.20092379 doi: medrxiv preprint

we integrated the 2018 behavioral risk factor surveillance system (brfss) survey data at state level, the population size and land area in each state from the united states census bureau, and the covid-19 infected cases and deaths data from the 1point3acres.com website at available dates from each state through the unique two-letter state abbreviations. from the 2018 brfss survey, 34 states provided information on the vaping status variable. the population size in each state in 2018 and land area in each state were obtained from the united states census bureau website. the covid-19 infected cases and deaths counts were available for each state from january 21, 2020 to april 25, 2020. reports of negative numbers of infected cases and deaths were excluded from the covid-19 data. after integrating the brfss data and the census data with the covid-19 infected cases and deaths from different dates at the state level, there were 1607 observations in the final analysis data. the vaping status variable was defined by the answers to the question "do you now use e-cigarettes, every day, some days, or not at all?" in the 2018 brfss survey. subjects who now use e-cigarettes every day or some days were classified as vapers and subjects who responded that they use e-cigarettes "not at all" or "not applicable" were classified as non-vapers. the weighted frequency of vapers in each state was obtained using the proc surveyfreq procedure in sas version 9.4 (sas institute inc., cary, nc), considering the complex sampling design of the brfss survey.
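the classification rule and survey weighting described above can be sketched in a few lines. this is only an illustration under stated assumptions: the respondent records and weights are hypothetical, the function name is invented for this example, and the sketch computes a point estimate only (it does not reproduce the design-based variance estimation of a survey procedure such as proc surveyfreq).

```python
# Answers that classify a respondent as a current vaper, per the rule in the text.
VAPER_ANSWERS = {"every day", "some days"}

def weighted_proportion_vapers(respondents):
    """respondents: iterable of (answer, survey_weight) pairs for one state.

    Returns weighted frequency of vapers divided by weighted frequency of all subjects.
    """
    vaper_weight = 0.0
    total_weight = 0.0
    for answer, weight in respondents:
        total_weight += weight
        if answer in VAPER_ANSWERS:
            vaper_weight += weight
    return vaper_weight / total_weight

# Hypothetical respondents for a single state (answer, survey weight).
sample = [
    ("every day", 120.0),
    ("some days", 80.0),
    ("not at all", 3000.0),
    ("not applicable", 800.0),
]
proportion = weighted_proportion_vapers(sample)
```

each respondent counts in proportion to the number of people they represent, so the ratio estimates the share of vapers in the state population rather than in the sample.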
the weighted proportion of vapers in each us state was calculated using the ratio of weighted frequency of vapers and weighted frequency of total number of subjects in each state. the outcomes used in the current analysis are the number of covid-19 infected cases and deaths. covariates considered in the current investigation include population size, population density (calculated using population size divided by land area), age, gender, race/ethnicity, education, income, mental health, physical health, obesity, respiratory disease (including asthma and copd), heart disease, cancer, stroke, diabetes, kidney disease, and smoking (currently smoke every day or some days). the number of covid-19 infected cases was also used as a covariate when modeling the covid-19 deaths. generalized estimating equation (gee) models with negative binomial distribution assumptions and log link functions were used to examine the association of weighted proportion of vapers with the number of covid-19 infections and deaths, after adjusting for the confounding effects from significant covariates. 12, 13 the correlations of number of covid-19 infections and deaths from different dates within the same state were considered through the first-order autoregressive (ar(1)) variance-covariance structure within the gee model framework. the purposeful covariates selection method was used to select significant covariates for the gee models. 14 variance inflation factor (vif) was used to examine multicollinearity among the predictor variables in the gee models. 15 a vif value greater than five was considered to indicate multicollinearity in the fitted gee model.
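because the gee models use a log link, their coefficients are on the log scale, and exponentiating a coefficient gives the multiplicative change in the expected count per unit increase in the covariate. the short sketch below applies this to the two coefficients and confidence limits reported in the abstract; the function name is hypothetical, and reading the result as a rate ratio is a standard property of log-link models rather than a statement taken from the paper.

```python
import math

def multiplicative_effect(beta, ci_low, ci_high):
    """Convert a log-scale coefficient and its confidence limits to multiplicative factors."""
    return tuple(math.exp(v) for v in (beta, ci_low, ci_high))

# Coefficients reported in the abstract for a one percent increase
# in the weighted proportion of vapers.
cases = multiplicative_effect(0.3139, 0.0554, 0.5723)
deaths = multiplicative_effect(0.3705, 0.0623, 0.6786)
```

under this reading, the point estimates correspond to factors of about 1.37 for infected cases and about 1.45 for deaths, and both lower confidence limits stay above 1.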
all statistical analyses were conducted using statistical analysis software sas version 9.4 (sas institute inc., cary, nc) and r (r core team, 2017). significance levels for all tests were set at 5% for two-sided tests. the weighted proportion of vapers ranged from 2.86% to 6.42% for us states. male gender, poor physical health, cancer, and obesity had a negative association with the daily

using integrated state level data obtained from the 2018 brfss survey, the united states census bureau and the 1point3acres.com website, we were able to investigate the association of weighted proportion of vapers with covid-19 infected cases and deaths in the united states. we found a significant positive association of vaping with covid-19 infections and deaths at the state level controlling for sociodemographic and health related covariates. this finding

thus, we cannot determine causality between vaping and covid-19 infections and deaths. however, prior research supports the biological plausibility of a relationship between vaping and an increased susceptibility to respiratory infection. 16 multiple mouse models have observed an increased severity in infection associated with vaping exposure related to dysregulation of lung epithelial cells and an impaired immune response to both viral 17 and bacterial infection.
18 bacterial superinfection of viral illnesses like influenza and covid-19 is especially dangerous, as this leads to an increased severity in illness. 19 a human cell based model of exposure to nicotine-free flavored e-liquid observed immunosuppressive effects and impaired respiratory innate immune cell function (alveolar macrophages, neutrophils, and natural killer cells). 20 in humans, bronchoalveolar lavage samples from the airways of active vapers also revealed dysregulation of the airway's innate immune response including neutrophilic response and mucin secretion. 21 a wide variety of flavorings are used by vapers, many of which, such as diacetyl, acetoin, pentanedione, o-vanillin, maltol, and coumarin in nicotine-free e-liquid, could also trigger inflammatory responses in human monocytes. 7 finally, previous human studies found vaping is associated with increased risk of chronic bronchitic symptoms (chronic cough, phlegm, or bronchitis), 22-24 and epidemiologic studies observed an increased risk of self-reported wheezing and copd. 9, 10 with the ongoing covid-19 pandemic, particular health concerns have been raised regarding vaping, such as whether vapers have higher risk for covid-19 infection and could develop more severe symptoms once they contract covid-19. 11 to the best of our knowledge, this is the first population-based study to empirically examine and find an association between vaping and covid-19. the existing literature on the increased risk of respiratory infection in combustible cigarette smokers was summarized by a prior meta-analysis finding an increased risk of current smokers for influenza infection compared to non-smokers.
25 as covid-19 is a novel condition, the literature examining the risk of smokers for covid-19 is scant. a recent study based on 1099 covid-19 patients found smoking history was associated with covid-19 severity. 26 a recent systematic review on covid-19 and smoking concludes that smoking is likely associated with worse outcomes in covid-19. 27 however, other studies indicate that smoking might not be associated with the incidence and severity of covid-19; 28 for example, a recent meta-analysis based on chinese patients suggests that active smoking is not associated with severity of covid-19. 29 it remains unclear whether nicotine has a role in either the increased or decreased severity of illness for smokers with covid-19. our study did not find a significant association between the weighted proportion of smokers and the number of covid-19 infections and deaths at state level. due to the incomplete testing and tracking of home deaths, it is possible that a percentage of older smokers with comorbidities are dying at home from covid-19 and therefore are not captured in the reported covid-19 infections and deaths data. 30, 31 another possibility is that smokers with comorbidities are homebound and more likely to be strictly following social/physical distancing guidelines, and are therefore reducing their risk for covid-19. currently, there is limited evidence on the susceptibility of smokers to covid-19 infection and whether smokers have a worsened course in the setting of covid-19. additionally, it is unknown whether or not the covid-19 virus could be transmitted to those surrounding smokers through passive smoking and vaping.
more epidemiological and clinical studies are needed to investigate the association of smoking with covid-19 infections and deaths. we found that states with a larger weighted proportion of subjects who had less than high school education had a higher number of covid-19 infections. researchers from the university of southern california found that americans who had less than high school education had a lower perceived risk of exposure to covid-19 and a higher perceived risk of death than those who have college or higher degrees. 32 this might explain, in part, the positive association between the weighted proportion of less than high school education and the number of covid-19 infections. we also found that states with a larger proportion of non-hispanic blacks and hispanics had a larger number of covid-19 deaths. the covid-19 death rate data from washington d.c. and 36 us states reported through april 27, 2020 showed that non-hispanic blacks (28.4%) and hispanics (11.3%) had the highest covid-19 death rates. 33 this could be related to the higher proportion of chronic conditions such as hypertension, heart disease and diabetes in non-hispanic blacks and hispanics. 34, 35 we also found that states having a higher proportion of respiratory disease such as asthma or copd had a higher number of covid-19 deaths, which indicated that respiratory disease such as asthma and copd could potentially increase the risk of covid-19 deaths. there are several limitations in the current study. one limitation is that the weighted proportions of vapers, smokers, and other demographic and chronic disease variables are from the 2018 brfss data, which might differ from the 2020 estimates. the reported covid-19 infected cases and deaths obtained from the 1point3acres.com website could be subject to some reporting errors, as we noticed some negative numbers of covid-19 infected cases and deaths, which we excluded from further analysis. however, we compared the covid-19 data obtained from 1point3acres.
the content is solely the responsibility of the authors and does not necessarily represent the official views of the nih or the fda. the authors declare no competing interests. the 2018 behavioral risk factor surveillance system (brfss) survey data are publicly available from the centers for disease control and prevention website (https://www.cdc.gov/brfss/annual_data/annual_2018.html). the state population in 2018 and the land area in each state were obtained from the united states census bureau website (https://www.census.gov/). the covid-19 infected cases and deaths data were requested and obtained from the 1point3acres.com website (https://coronavirus.1point3acres.com/en).
• current e-cigarette use is positively associated with covid-19 infections.
• current e-cigarette use is positively associated with covid-19 deaths.
• this study emphasizes the importance of studying the susceptibility of current e-cigarette users to covid-19 infection and death.
who declares covid-19 a pandemic an interactive web-based dashboard to track covid-19 in real time current status of epidemiology, diagnosis, therapeutics, and vaccines for novel coronavirus disease 2019 (covid-19) prevalence of e-cigarette use among adults in the united states vaping away epithelial integrity electronic cigarette vapour increases virulence and inflammatory potential of respiratory pathogens e-cigarette flavored pods induce inflammation, epithelial barrier dysfunction, and dna damage in lung epithelial cells and monocytes electronic cigarette liquid increases inflammation and virus infection in primary human airway epithelial cells association of smoking and electronic cigarette use with wheezing and related respiratory symptoms in adults: cross-sectional results from the population assessment of tobacco and health (path) study, wave 2 use of electronic cigarettes and self-reported chronic obstructive pulmonary disease diagnosis in adults public health concerns and unsubstantiated claims at the intersection of vaping and covid-19 a generalized estimating equations approach to mixed-effects ordinal probit models modelling correlated data: multilevel models and generalized estimating equations and their use with data from research in developmental disabilities purposeful selection of variables in logistic regression applied linear-regression models -neter public health consequences of e-cigarettes electronic cigarettes disrupt lung lipid homeostasis and innate immunity independent of nicotine electronic cigarette inhalation alters innate immunity and airway cytokines while increasing the virulence of colonizing bacteria influenza and bacterial superinfection: illuminating the immunologic mechanisms of disease flavored e-cigarette liquids and cinnamaldehyde impair respiratory innate immune cell function e-cigarette use causes a unique innate immune response in the lung, involving increased neutrophilic activation and altered mucin secretion what are 
key: cord-163587-zjnr7vwm authors: altmejd, adam; rocklov, joacim; wallin, jonas title: nowcasting covid-19 statistics reported with delay: a case-study of sweden date: 2020-06-11 journal: nan doi: nan sha: doc_id: 163587 cord_uid: zjnr7vwm
the new coronavirus disease (covid-19) is rapidly spreading through the world. the availability of unbiased, timely statistics on trends in disease events is key to effective responses. but due to reporting delays, the most recently reported numbers frequently underestimate the total number of infections, hospitalizations and deaths, creating an illusion of a downward trend. here we describe a statistical methodology for predicting true daily quantities and their uncertainty, estimated using historical reporting delays. the methodology takes into account the observed distribution pattern of the lag.
it is derived from the removal method, a well-established estimation framework in the field of ecology. the new coronavirus pandemic is affecting societies all around the world. as countries are challenged to control and fight back, they need timely, unbiased data for monitoring trends and making fast and well-informed decisions (nature, 2020). official statistics are usually reported with long delay after thorough verification, but in the midst of a deadly pandemic, real-time data is of critical importance for policymakers (jajosky and groseclose, 2004). the latest data are often not finalized, but change as new information is reported. in fact, because of reporting delays the most recent days have the fewest cases accounted for, producing a dangerous illusion of an always improving outlook. still, these unfinished statistics offer crucial information. if the pandemic is indeed slowing, we should not wait for the data to be finalized before using it. rather, we argue that actual case counts and deaths should be nowcasted to account for reporting delay, thus allowing policymakers to use the latest numbers available without being misled by reporting bias. such predictions provide an additional feature that is perhaps even more important: they explicitly model the uncertainty about these unknown quantities, ensuring that all users of these data have the same view of the current state of the epidemic. in this paper we describe a statistical methodology for nowcasting epidemic statistics, such as hospitalizations or deaths, and their degrees of uncertainty, based on the daily reported event frequency and the observed distribution pattern of reporting delays. the prediction model builds on methods developed in ecology, referred to as the "removal method" (pollock, 1991). to help motivate why such nowcasting is needed, we now turn to the case of sweden. the model is flexible by design, however, and could easily be applied to other countries as well.
the swedish public health agency updates the covid-19 statistics daily (see footnote 1). during a press conference, they present updates on the number of deaths, admissions to hospitals and intensive care, as well as case counts. one of the reasons for following these indicators is to enable public health professionals and the public to observe the evolving patterns of the epidemic (anderson et al., 2020). in relation to policy, it is of specific interest to understand whether the growth rate changes, which could indicate the need for a policy response. however, in each daily report only a proportion of the recent deaths is yet known, and this bias produces the illusion of a downward trend. the death counts suffer from the longest reporting delay. in their daily press conference, the swedish public health agency warns about this by stopping the reported 7-day moving average trend line 10 days before the latest date. but deaths are often reported with delays far longer than 10 days, and a bar plot still shows the latest incomplete counts, creating a sense of a downward trend. in fact, this might be the reason why the number of daily deaths has been underestimated repeatedly. at the peak, deaths were initially believed to level out at around 60 per day, but after all cases had been reported more than two weeks later, the actual number was close to 120 (öhman and gagliano, 2020). we propose to use the removal method, developed in animal management (pollock, 1991), to present an estimate of the actual frequencies on a given day and their uncertainty. the method has a long history dating back at least to the 1930s (leslie and davis, 1939). the first refined mathematical treatment of the method is credited to moran (1951), and more modern derivatives exist today (matechou et al., 2016). it is a commonly applied method today when analyzing age cohorts in fishery and wildlife management.
the removal method has three major advantages over simply reporting moving averages:
• it does not rely on any previous trend in the data,
• we can generate prediction intervals for the uncertainty about daily true frequencies,
• the uncertainty estimates can be carried over to epidemiological models to help create more realistic models.
footnote 1: the data is published on https://www.folkhalsomyndigheten.se/smittskydd-beredskap/utbrott/aktuella-utbrott/covid-19/bekraftade-fall-i-sverige/.
a classic application of the method is estimating the size of a closed population of animals by trapping (pollock, 1991). each day the trapped animals are collected and kept, and if there is no immigration the number of animals trapped on the following days will, on average, decline. this pattern of declining numbers of trapped animals allows one to draw inference about the underlying population size. here we replace the animal population with the true number of deaths on a given day. instead of traps we have the new reports of covid-19 events. as the number of newly reported deaths for a given day declines, we can draw inference on how many actually died that day. if we assume that the reporting structure is constant over time, we can after a while quickly get a good estimate of the actual number. suppose for example that on day one, 4 individuals are reported dead for that day. on the second day, 10 deaths are recorded for day two. then, with no further information, it is reasonable to assume that more people died on day two. if the proportion reported on the first day is 3%, the actual number of deaths would be 133 for day one and 333 for day two. if additionally, 60 deaths are reported during the second day to have happened during day one, and on the third day, only 40 are reported for day two, we now have conflicting information.
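the arithmetic of this example can be sketched in a few lines of python. this is only an illustration of the naive scale-up reasoning, not the model proposed in the paper; the function name `scale_up` and the variable names are hypothetical, and the 3% first-day share is the assumption stated in the example.

```python
def scale_up(reported, proportion_reported):
    """Naive estimate of the true count: reported count / assumed reported share."""
    return reported / proportion_reported

# First-day reports: 4 deaths for day one, 10 deaths for day two,
# each assumed to be 3% of the true number for that day.
first_day_share = 0.03
estimate_day_one = round(scale_up(4, first_day_share))
estimate_day_two = round(scale_up(10, first_day_share))

# Cumulative counts after two reporting days for each death date:
# day one: 4 + 60, day two: 10 + 40.
after_two_reports_day_one = 4 + 60
after_two_reports_day_two = 10 + 40

# The first-day reports rank day two higher, the two-day totals rank day one higher.
first_day_says_day_two_worse = estimate_day_two > estimate_day_one
two_day_totals_say_day_one_worse = after_two_reports_day_one > after_two_reports_day_two
```

both comparisons come out true, which is the conflict the example describes: a single fixed reporting share cannot reconcile the two sets of reports, so the reporting probabilities have to be treated as uncertain.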
from the first-day reports it seemed like more people had died during day two, but the second-day reports gave the opposite indication. the model we propose systematically deals with such data, and handles many other sources of systematic variation in reporting delay. in fact, the swedish reporting lag follows a calendar pattern: the number of events reported during weekends is much smaller. to account for this, we allow the estimated proportions of daily reported cases to follow a probability distribution that takes into consideration what type of day it is. we propose a bayesian version of the removal model that assumes an overdispersed binomial distribution for the daily observations of covid-19 deaths in sweden. we then calculate the posterior distribution, prediction median and 95% prediction intervals of the expected deaths from the reported deaths on each specific day. the method and algorithm are thoroughly described in the supplementary information. to get accurate estimates we apply two institution-specific corrections. first, we only count workdays as constituting reporting delay, as very few deaths are reported during weekends. second, we apply a constant bias correction to account for the fact that swedish deaths come from two distinct populations with different trends: deaths in hospitals, and deaths in elderly care. in figure 1 we apply the model to the latest statistics from sweden. the graph shows reported and predicted deaths (with uncertainty intervals) as bars, and a dashed line plots the 7-day (centered) moving average. a version without predictions is used in the public health agency's daily press briefings. as expected, the model provides estimates of actual deaths considerably above the reported number of deaths. note how the model predicts additional deaths above the moving average line. to judge whether or not the model is accurate we need to compare it to a benchmark.
the moving average of reported deaths is not useful, since it is biased for deaths that occurred within the last week. instead, we create a benchmark prediction from a normal distribution whose mean and standard deviation are obtained by adding the historical reporting lags from the last two weeks to the reported numbers. figure 2 depicts four randomly chosen dates where the model is compared to the benchmark. the model and the benchmark are tasked with predicting the total number of individuals who died on a given date and have been reported within 14 days of that date. as time progresses, more deaths are reported and the dashed grey line approaches the horizontal line. meanwhile, model uncertainty decreases. figure 3 shows model performance compared to the benchmark for three different performance metrics. all three graphs are based on predictions of reported deaths within 14 days, and show how performance increases as more data has been reported. each data point is the average over all dates where predictions can be evaluated. scrps is a measure of accuracy that rewards precision; it is a proper scoring rule like the continuous ranked probability score (crps) or the brier score (see definition in appendix) (bolin and wallin, 2019). the central plot shows the width of the prediction intervals, and the rightmost one the proportion of prediction intervals (pis) that cover the true value. benchmark and model point estimates are similarly close to the truth. the model produces tighter prediction intervals. for reporting lags of 5-8 days (see figure 3), the intervals are too tight. this is likely because the public health agency queries the swedish death registry for covid-19 deaths only once or twice a week. since we do not know the process, it has not been explicitly modeled. the model proposed here can estimate the trends in surveillance data with reporting delays, such as the daily covid-19 reports in sweden.
generating accurate estimates of the actual event frequencies based on these reports is highly relevant and can have large implications for the interpretation of the trends and evolution of disease outbreaks. in sweden, delays are considerable and exhibit a weekday and holiday pattern that needs to be accounted for to draw conclusions from the data. the method and algorithm proposed overcome major shortcomings in the daily interpretation and practice of analyzing and controlling the novel coronavirus pandemic. they also provide valuable measures of uncertainty around these estimates, showing users how large the range of possible outcomes can be. whenever case statistics are collected from multiple sources and attributed to their actual event date in the middle of a public health emergency, reporting delays similar to the ones in sweden will necessarily occur. the method described thus has implications and value beyond sweden, for any situation where nowcasts of disease event frequencies are of relevance to public health. nevertheless, the method also has its limitations. as presented, the model assumes that all deaths are reported in the same manner. given that there are many regions in sweden, this is unlikely to be the case. for example, it is easy to see that the swedish region västra götaland follows a different reporting structure than stockholm. building a model for each region separately would most likely give better results and make the assumptions more reasonable. unfortunately we do not currently have access to the high-resolution data required to do so. moreover, deaths are reported from two distinct populations that seem to follow different trends. at the time of writing, the daily deaths in elderly care, reported with a longer delay, seem to be decreasing more slowly than hospital deaths. but the statistics offer only aggregate numbers, prohibiting us from modeling two distinct processes.
however, we have noted a clear decline in the proportion of deaths reported during the first two working days. for example, of the deaths occurring on april 2, ≈ 30% were reported within the first two working days, whereas for may 18 only ≈ 10% were reported during the first two working days. we address this by assuming that the deaths reported during the first two working days come from a different population than those reported on the remaining days. another limitation is that the model assumes that the number of newly reported deaths for a given day cannot be negative, which is not actually true, due to miscounts or misclassification of days. the number of such cases is very small, however, and their removal should not make much difference. the central assumption of the model is that the proportion of deaths reported each day is fixed (up to the known covariates). if actual reporting standards change over time, the model will not be able to account for this. but reporting likely becomes faster as the crisis infrastructure improves. if reporting improves or is otherwise changed after a while, and this is not accounted for by a covariate in the model, the model will report incorrect numbers. of course, there might be unknown variables that we have failed to incorporate, but at the least the model is an improvement over the estimates using moving averages. when the covariates of the reporting delay pattern are known, the model can incorporate them and provide more accurate predictions. in this paper, we provide a method to accurately nowcast daily covid-19 statistics that are reported with delay. by systematically modelling the delay, policy makers can avoid dangerous illusory downward trends. our model also gives precise uncertainty intervals, making sure users of these statistics are aware of the fast-paced changes that are possible during this pandemic.
matrix of reports, one row per death date and one column per reporting date:
$$\begin{pmatrix} r_{11} & r_{12} & \cdots & \cdots & r_{1T} \\ & r_{22} & \cdots & \cdots & r_{2T} \\ & & r_{33} & \cdots & r_{3T} \\ & & & \ddots & \vdots \end{pmatrix}$$
typically in removal sampling one would set the probability of reporting uniform, i.e. $p_{i,j} := p$. for these data, however, this is clearly not realistic given the weekly pattern in reporting: there is very little reporting during weekends. instead we assume that we have $k$ different probabilities. further, to account for overdispersion, we assume that each probability, rather than being a fixed scalar, is a random variable with a beta distribution. the beta distribution has two parameters, $\alpha$ and $\beta$. this results in the following distribution for the probabilities: [equation missing]. here, $j \in H$ means that day $j$ is a holiday or a weekend day, for which separate parameters are used [expression missing]. these extra parameters are created to account for the under-reporting that occurs during weekends and holidays. finally we add an extra mixture component that allows for very low reporting. for the $\alpha$ and $\beta$ parameters we use an (improper) uniform prior. for the deaths, $D$, one could imagine several different priors, ideally some sort of epidemiological model. here, however, we just assume a log-gaussian cox process (møller et al., 1998), but instead of a poisson distribution we use a negative binomial to handle possible overdispersion. the latent gaussian process has an intrinsic random walk distribution (rue and held, 2005), i.e. [equation missing]. this model is chosen to create temporal smoothing between the reported deaths. for the hyperparameter $\sigma^2$ we impose an inverse gamma distribution; this prior is suitable here because it guarantees that the process is not constant ($\sigma^2 = 0$), which we know is not the case. putting the likelihood and priors together we get the following hierarchical bayesian model: [equations missing], where j ≤ i and $i = 1, \ldots, T$. inference about the number of deaths $D$ is based on the posterior distribution of $D$ given the observations $r$. in order to generate samples from this distribution we use a markov chain monte carlo method (brooks et al., 2011).
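to make the reporting part of the model more concrete, the following python sketch simulates it forward: for each death date, every reporting day removes a binomial share of the deaths not yet reported, with the reporting probability drawn from a beta distribution. this is a simplified illustration under stated assumptions, not the authors' implementation: it uses a single pair (alpha, beta) for all days, ignores the holiday adjustment and the low-reporting mixture component, and indexes reports by lag rather than by calendar date. the function name `simulate_reports` and its parameters are hypothetical.

```python
import numpy as np

def simulate_reports(deaths, alpha, beta, n_lags, rng):
    """Simulate removal-type reporting with beta-distributed probabilities.

    deaths : true number of deaths per death date
    alpha, beta : beta distribution parameters (assumed shared by all days)
    n_lags : number of reporting days simulated per death date
    Returns an array where entry [i, l] is the number of deaths for date i
    that are newly reported l reporting days after the first report.
    """
    remaining = np.asarray(deaths, dtype=int).copy()
    reports = np.zeros((remaining.size, n_lags), dtype=int)
    for lag in range(n_lags):
        # A fresh probability per death date and reporting day gives
        # overdispersion relative to a plain binomial.
        p = rng.beta(alpha, beta, size=remaining.size)
        new = rng.binomial(remaining, p)   # only unreported deaths can be reported
        reports[:, lag] = new
        remaining = remaining - new
    return reports

rng = np.random.default_rng(0)
true_deaths = [50, 80, 120]
simulated = simulate_reports(true_deaths, alpha=2.0, beta=5.0, n_lags=10, rng=rng)
```

the cumulative row sums of `simulated` can never exceed the true counts, and the newly reported numbers tend to decline with the lag. that decline is the pattern from which the removal method infers the total.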
in more detail, we use a blocked gibbs sampler, which generates samples in the following sequence:
• we sample $\alpha, \beta, \alpha^H, \beta^H \mid D, r$ using the fact that one can integrate out $p$ in the model, and then $D \mid \alpha, \beta, \alpha^H, \beta^H, r, \lambda$ follows a beta-binomial distribution. here too we use an adaptive mala (atchadé, 2006) to sample these parameters.
• to sample $D \mid \alpha, \beta, \alpha^H, \beta^H, r, \lambda$, we use that each death count $D_i$ is conditionally independent, and we simply use a metropolis-hastings random walk (mh-rw) to sample each one.
• to sample $\lambda \mid D, \sigma^2$ we again use an adaptive mala.
• finally we sample $\sigma^2 \mid D$ and $p_0, \pi$ directly, since these distributions are explicit, and $\phi$ using a mh-rw.
in this section, we present an additional comparison of the model to the benchmark. we first describe the benchmark model in detail. the benchmark model simply takes the sum of average historical reporting lags for the preceding 14 days. as before, $r_{ij}$ is the number of deaths that happened on day $i$ and were recorded on day $j$. to predict the number of people that died on a given day, we first calculate lag averages [equation 1 missing], where $\bar{r}_{i,i+l}$ is the average number of deaths reported with a lag of $l$ days, based on the 14 reports most closely preceding day $i$. if we are looking at data released 2020-04-28 and call this day 0, the latest death date for which we have a 10-day ($l = 10$) reporting lag observation is $r_{-10,0}$. the average for lag(0, 10) is therefore taken over the 14 days between $r_{-24,-14}$ and $r_{-10,0}$ (2020-04-04 and 2020-04-18). for this reason, some of the earlier predictions will not have data from 14 days. the average is then taken over all available reports. in the comparisons we aim at predicting the total number of deaths that will have been reported within 14 days of the death date. to do so, we sum the average lags that have yet to be reported.
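the benchmark just described can be sketched as follows. this is one possible reading, written under explicit assumptions rather than taken from the authors' code: reports are stored in a square array `R` with `R[i, j]` the deaths for day `i` first reported on day `j`, days are numbered from 0, the lag average uses the most recent `window` death dates whose report at that lag has already arrived, and pending lags start at the first lag not yet observed. the names `lag_average`, `benchmark`, `horizon` and `window` are hypothetical.

```python
import numpy as np

def lag_average(R, today, lag, window=14):
    """Mean of R[k, k + lag] over the most recent `window` death dates k
    whose lag-`lag` report is already available (k + lag <= today)."""
    last = today - lag                    # latest death date observed at this lag
    first = max(0, last - window + 1)     # fewer dates are available early on
    days = np.arange(first, last + 1)
    if days.size == 0:
        return 0.0
    return float(np.mean(R[days, days + lag]))

def benchmark(R, i, today, horizon=14, window=14):
    """Predicted total for death date i reported within `horizon` days:
    counts already reported plus average counts for the lags still pending."""
    upto = min(today, i + horizon)
    observed = R[i, i:upto + 1].sum()
    pending = sum(lag_average(R, today, lag, window)
                  for lag in range(today - i + 1, horizon + 1))
    return float(observed + pending)

# Toy data: every death date reports 5, 3 and 2 deaths at lags 0, 1 and 2.
R = np.zeros((10, 10))
for k in range(10):
    for lag, count in enumerate([5, 3, 2]):
        if k + lag < 10:
            R[k, k + lag] = count
```

with this toy pattern and `horizon=2`, the benchmark for the most recent death date adds the average lag-1 and lag-2 counts to the 5 deaths already reported, so a date with a single report and a date with complete reports get the same predicted total.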
if we are predicting the number of people that have yet to be reported dead for day -3, we already know the true values for $r_{-3,-3}$, $r_{-3,-2}$, $r_{-3,-1}$, and $r_{-3,0}$, so we only need to predict $r_{-3,1}, \ldots, r_{-3,10}$. the prediction is then
$$\mathrm{benchmark}(i, j) = \sum_{l=i}^{j} r_{i,l} + \sum_{l=j}^{14} \bar{r}_{i,l}. \quad (2)$$
as confidence interval we simply use a normal assumption with the standard deviations of the reporting lags, assuming independence, i.e. the standard deviation of the prediction is just the square root of the sum of $\mathrm{Var}(r)$.
references
anderson et al. (2020). how will country-based mitigation measures influence the course of the covid-19 epidemic?
atchadé (2006). an adaptive version for the metropolis adjusted langevin algorithm with a truncated drift.
bolin and wallin (2019). scale dependence: why the average crps often is inappropriate for ranking probabilistic forecasts.
brooks et al. (2011). handbook of markov chain monte carlo.
jajosky and groseclose (2004). evaluation of reporting timeliness of public health surveillance systems for infectious diseases.
leslie and davis (1939). an attempt to determine the absolute number of rats on a given area.
matechou et al. (2016). open models for removal data.
møller et al. (1998). log gaussian cox processes.
moran (1951). a mathematical theory of animal trapping.
nature (2020). coronavirus: three things all governments and their science advisers must do now.
öhman and gagliano (2020). antalet virusdöda har underskattats.
pollock (1991). review papers: modeling capture, recapture, and removal statistics for estimation of demographic parameters for fish and wildlife populations: past, present, and future.
rue and held (2005). gaussian markov random fields: theory and applications.
a appendix
before presenting the model we describe some notation used throughout the appendix. for an $m \times n$ matrix $r$ we use the following broadcasting notation: $r_{k,j:l} = [r_{k,j}, r_{k,j+1}, \ldots, r_{k,l}]$. further, $x \mid y \sim \pi(\cdot)$ means that the random variable $x$, conditional on $y$, follows the distribution $\pi(\cdot)$. the relevant variables in the model are the following:
variable name; dimension; description
$\alpha$, $\beta$; [dimension missing]; latent prior parameter for $p$.
$\alpha^H$; 2 × 1; parameter for the probability $p$, for holiday adjustment.
$\beta^H$; 2 × 1; parameter for the probability $p$, for holiday adjustment.
$\mu$; $T$ × 1; $\mu_i$ is the intensity of the expected number of deaths on day $i$.
$\sigma^2$; 1 × 1; variation of the random walk prior for the log intensity.
$\phi$; 1 × 1; overdispersion parameter for the negative binomial distribution.
$p_0$; 1 × 1; probability of reporting for a low reporting event.
$\pi$; 1 × 1; probability of a low reporting event.
the most complex part of our model is the likelihood, i.e. the density of the observations given the parameters. here the data consist of the daily reports of recorded deaths for the past days. this can conveniently be represented as an upper triangular matrix, $r$, where $r_{i,j}$ is the number of new deaths for day $i$ reported on day $j$. this matrix is displayed on the left in table 1. we assume that, given the true number of deaths on day $i$, $D_i$, on each reporting day $j$ each of the remaining $D_i - \sum_{k=1}^{j-1} r_{i,k}$ deaths is recorded with probability $p_{i,j}$, i.e. $r_{i,j} \mid D_i, r_{i,1:j-1}, p_{i,j} \sim \mathrm{Bin}\left(D_i - \sum_{k=1}^{j-1} r_{i,k},\; p_{i,j}\right)$.
key: cord-021399-gs3i7wbe authors: dada, m.a.; lazarus, n.g. title: sudden natural death | infectious diseases date: 2005-11-18 journal: encyclopedia of forensic and legal medicine doi: 10.1016/b0-12-369399-3/00357-8 sha: doc_id: 21399 cord_uid: gs3i7wbe nan
a wide range of deaths from natural causes is encountered in the field of forensic medicine. despite the advances in the diagnosis and treatment of infectious diseases, a substantial number of sudden and unexpected deaths are caused by infections. in most medicolegal systems these deaths are subject to a forensic investigation. the world health organization defines sudden death as that occurring within 24 h of the onset of symptoms. some authors variably define sudden death as that occurring within 1, 6, and 12 h of the onset of symptoms. forensic pathologists should be aware of the importance of infectious causes of sudden death in the present era of bioterrorism and emergent and reemergent diseases. genetic engineering has led to the development of highly infectious and virulent strains of microorganisms (e.g., anthrax).
emerging infectious diseases are infections whose incidence has increased in recent years and/or threatens to increase in the near future. reemergence refers to the reappearance of a known infection after a period of disappearance or decline. death from infectious agents may occur as a direct consequence of the infection or from complications such as immunosuppression caused by the infection and adverse reactions to therapeutic drugs. sudden death due to infectious disease may be classified by organ system involvement (e.g., cardiac -myocarditis; nervous system -meningitis and encephalitis) or according to the etiological agent (e.g., viral, chlamydial, bacterial, fungal, protozoal, or helminthic) . the common infectious causes of sudden death by organ system are listed in table 1 . the morphological findings at autopsy will depend on the type of organism, the site involved, and the host's response to the organism. microbiological demonstration of an organism does not equate to disease, as a host may be colonized by bacteria or the patient may have an asymptomatic viral infection. the exquisite sensitivity of molecular tests, e.g., polymerase chain reaction, may exacerbate this problem if the results are not correlated with the pathological findings at autopsy. categories of human pathogens include prions; viruses; chlamydiae, rickettsiae, and mycoplasmas; bacteria; fungi; protozoans; and helminths. infection by prions, rickettsiae, and mycoplasmas is not normally associated with sudden and unexpected death. viruses are ubiquitous and cause a spectrum of disease in humans. these may range from asymptomatic infection, severe debilitating illness, to sudden death. viral infections causing sudden death usually involve the cardiac, respiratory, or the central nervous system. morphologic findings in viral infections may include intranuclear and/or intracytoplasmic inclusions, multinucleate giant cells, and tissue necrosis (cytopathic effect). 
in many cases the diagnosis can only be made on special investigations, e.g., culture, electron microscopy, serology, or molecular testing. viral hemorrhagic fevers such as marburg, lassa, and ebola virus may cause sudden death in children. if there is any suspicion of a viral hemorrhagic fever, special care must be taken to avoid unwarranted exposure to health workers. the local public health officials must be informed and consideration given to limited autopsy examination in consultation with a virologist (e.g., postmortem blood sampling and liver biopsy). cardiac involvement usually takes the form of myocarditis. although many viruses may cause myocarditis (table 2) , coxsackie a and b are responsible for most cases. fulminant coxsackievirus infection may also cause leptomeningitis, florid interstitial pneumonitis, pancreatitis, and focal hepatic necrosis. coxsackie b viruses should also be considered as a cause of sudden infant death. at autopsy, the myocardium is usually mottled and flabby. histology reveals focal infiltrates of inflammatory cells (neutrophils and/or lymphocytes, plasma cells, and macrophages). at least two foci of individual myofiber necrosis associated with 5-10 inflammatory cells are required for the histological diagnosis of myocarditis. focal aggregates of lymphocytes not associated with necrosis may be seen in elderly patients and are not diagnostic of myocarditis. myocardial involvement may be patchy. for adequate histological sampling, it is recommended that at least six sections be taken from various areas of the myocardium, including the left ventricle and nodal tissue. indirect damage to the myocardium may occur as an allergic response to a viral infection and eosinophilia, e.g., in eosinophilic myocarditis. this is a rare cause of sudden death in apparently healthy children due to the cardiac toxicity of eosinophils. 
studies have shown that persons undergoing severe mental or physical stress may have reduced immunity to viral infections. in the investigation of sudden death in athletes, the diagnosis of viral myocarditis must be considered. enteroviral infection may also play an important role in coronary plaque instability and may precipitate coronary thrombosis, leading to ventricular tachyarrhythmias and sudden death. viral infections of the respiratory system sudden death due to viral involvement of the respiratory system may be due to fulminant viral pneumonitis or bacterial pneumonia complicating an initial viral pneumonitis. viruses implicated include respiratory syncytial virus, human herpesvirus-6, and parainfluenza virus in children, and adenovirus and influenza a and b in adults. microscopically, the findings of a viral pneumonitis are usually nonspecific and include edema and widening of the interstitial septa with a mononuclear cell infiltrate. in some cases, diagnostic viral inclusions may be demonstrated. emergent diseases such as severe acute respiratory syndrome (sars) have a high mortality and may cause death within hours. sars refers to an acute respiratory illness caused by infection with a novel coronavirus currently known as the sars virus. postmortem histopathological evaluations of lung tissue show diffuse alveolar damage consistent with the pathologic manifestations of acute respiratory distress syndrome. there is usually mild interstitial inflammation with scattered alveolar pneumocytes showing cytomegaly, and enlarged nuclei with prominent nucleoli. when faced with the finding of diffuse alveolar damage at autopsy, the pathologist should consider other infective causes such as influenza, para influenza, respiratory syncytial, and adenoviruses, chlamydia, mycoplasma, pneumococcus, legionella, and pneumocystis. 
sudden death may occur due to direct infection of the nervous system or a complication of a viral infection such as toxoplasmosis in human immunodeficiency virus/acquired immunodeficiency syndrome (hiv/aids). herpes simplex virus-1 encephalitis is usually due to reactivation of latent infection. commonly affected sites include the temporal lobe(s) (medial before lateral), the inferior frontal lobe(s), and the sylvian cortex(es). at autopsy there is widespread and asymmetrical necrosis. in fulminant cases there is prominent hemorrhage and swelling with raised intracranial pressure and brain herniation. histological findings include perivascular cuffing by mononuclear cells (figure 1 ) and, in a small number of cases, intranuclear inclusions may be seen in astrocytes and neurons. in adult hiv infections, sudden death from infective causes may be due to opportunistic infections (e.g., toxoplasmosis) or rupture of mycotic aneurysms. in viral central nervous system infections the brain may appear macroscopically normal, especially in very young, elderly, debilitated, and immunocompromised individuals. specimens should be taken for microbiology and histology. serum and cerebrospinal fluid (csf) should be sent for antibody studies. tissue for histological examination should be taken from normal, obviously abnormal, and transition areas. routine sections should be taken from the cerebral cortex (all four lobes), thalamus, basal ganglia, hippocampus, brainstem, and cerebellum. as poliomyelitis has been described as a cause of sudden death in infants, autopsy protocols in sudden death should include histological examination of spinal cord and dorsal root ganglia. chlamydia pneumoniae may be associated with myocarditis and sudden unexpected death. bacterial infections are responsible for sudden unexpected death in adults and children. 
in the pediatric population bacterial infections of the respiratory, gastrointestinal, and central nervous system account for the majority of cases of sudden death. bacterial infections of the cardiovascular system bacterial causes of myocarditis include corynebacterium diphtheriae, neisseria meningitidis, and borrelia burgdorferi. in b. burgdorferi, cardiac involvement occurs in 1-8% of cases and death may occur as a result of conduction disturbances. in diphtheritic myocarditis myocardial damage is caused by the release of toxins. bartonella-induced silent myocarditis has been described as a cause of sudden unexpected cardiac death in athletes. granulomatous myocarditis may also lead to sudden death ( table 3 ). the mechanism of death includes arrhythmias, cardiac rupture, coronary occlusion, obstruction to pulmonary blood flow leading to fatal hemorrhage, and impaired myocardial contractility. cardiac tuberculosis is usually an autopsy diagnosis. histological examination of the myocardium shows a nodular, miliary, or diffuse infiltrative pattern. the coronary arteries may show narrowing or complete occlusion due to an intimal or diffuse tuberculous arteritis. it is uncommon to demonstrate acid-fast bacilli within the lesions. molecular tests such as the ligase chain reaction (lcr) and polymerase chain reaction (pcr) may be used to demonstrate the organism. sudden death in infective endocarditis occurs as a result of perforation of a free-wall myocardial abscess or rupture of a valve leaflet. staphylococcus aureus is responsible for 10-20% of cases and is the major cause in intravenous drug abusers. other bacterial causes include haemophilus, actinobacillus, cardiobacterium, eikenella, and kingella (hacek group). negative bacterial cultures may be found in 10% of cases as a result of prior antibiotic therapy. 
the most common sites of infection are the aortic and mitral valves, except in intravenous drug abusers, where the right-sided valves are primarily affected. tertiary syphilis causing aortitis may cause sudden death from rupture of aortic aneurysms with aortic dissection. the mechanism of death is either blood loss with hypovolemic shock or a fatal cardiac tamponade from intrapericardial rupture. bacterial infections of the respiratory system sudden death from acute epiglottitis occurs from respiratory obstruction caused by swelling of the epiglottic folds, uvula, and vocal cords. the most common cause of acute epiglottitis in developing countries is haemophilus influenzae type b. in countries with established immunization programs, the incidence of h. influenzae epiglottitis has decreased and other bacteria, such as streptococcus, staphylococcus, and pneumococcus, have been implicated as possible causes. postmortem blood cultures are positive in 50-75% of cases. lobar pneumonia ( figure 2 ) and confluent bronchopneumonia are the most frequent cause of sudden death from acute pulmonary disease. some 90-95% of lobar pneumonia is due to streptococcus pneumoniae type 3. bronchopneumonia is caused by staphylococci, streptococci, h. influenzae, pseudomonas aeruginosa, and coliform bacteria. pulmonary tuberculosis may result in hemoptysis, which can cause hypovolemic shock and sudden death. histologically, caseating granulomas are found. acid-fast bacilli are demonstrated using the ziehl-neelsen stain (figure 3) . corynebacterium diphtheriae produces a gray pseudomembrane from the pharynx to the larynx, and this may lead to respiratory obstruction and sudden death. legionnaire's disease is associated with outbreaks of sudden death. the disease is caused by legionella pneumophila, a facultative intracellular organism. it causes severe pneumonia in the elderly, in smokers, and in immunocompromised patients. 
the organisms may be transmitted via droplet spread from contaminated air-conditioning units and water coolers. the organism may be demonstrated by a modified silver stain (dieterle stain) or by immunofluorescence and culture. pyogenic meningitis may cause sudden death. the causative organism varies according to the age of the patient ( table 4) . the location of the exudates depends on the organism. in h. influenzae it is basally located. in pneumococcal meningitis it occurs over the convexities of the brain in the parasagittal region ( figure 4) . microscopic examination reveals neutrophils filling the subarachnoid space with extension of the inflammation into the leptomeningeal veins in fulminant cases. blood spread is the most common means of entry; however other routes of infection include local extension of infection, e.g., paranasal sinusitis, osteomyelitis, direct implantation, and via the peripheral nervous system. diffuse bacterial meningitis may follow rupture of a brain abscess, which may lead to sudden death. the organisms may be demonstrated by microbiological culture of the csf and examination of gram stains of the csf and brain tissue. bacterial urogenital tract infections fulminant acute bacterial pyelonephritis may lead to septicemia, causing sudden death. at autopsy, the kidneys show tubular necrosis with interstitial suppurative inflammation. renal papillary necrosis may also be present. severe bacterial enterocolitis may lead to sudden death, especially in the young. the pathogenesis of the diarrhea depends on the cause. vibrio cholerae and clostridium perfringens cause diarrhea by ingestion of a preformed toxin that is present in contaminated foods. enteroinvasive organisms such as salmonella, shigella, and enteroinvasive escherichia coli invade and destroy mucosal epithelial cells. death occurs as a result of dehydration and electrolyte imbalance. 
bleeding peptic ulcers that are caused by helicobacter pylori may be the first indication of an ulcer and account for 25% of ulcer deaths, many of which are sudden and unexpected. fulminant bacterial peritonitis secondary to acute appendicitis, acute salpingitis, ruptured peptic ulcer, diverticulitis, strangulated bowel, and cholecystitis may cause sudden death. primary peritonitis may occur postsplenectomy and in patients with splenic hypoplasia. patients with sickle-cell disease may have anatomical or functional asplenia. the former is due to repeated bouts of infarction leading to autosplenectomy. the latter is due to a defect in opsonization of encapsulated bacteria. massive bilateral adrenal hemorrhage with adrenocortical insufficiency may occur as a result of septicemic shock from overwhelming bacterial infection (waterhouse-friderichsen syndrome). the most common association is with neisseria meningitidis septicemia; however, other virulent organisms, e.g., h. influenzae and pseudomonas species, may also lead to this syndrome. sudden death due to fungal infection may occur in an immunocompromised host such as in hiv/aids. organisms include cryptococcus (meningitis or disseminated disease) and pneumocystis carinii (pneumonia). intravenous drug abusers are susceptible to endocarditis due to fungi such as candida. these patients are prone to fungal thromboembolism, leading to sudden death. sudden death may also be due to a complication of fungal diseases such as fatal subarachnoid hemorrhage complicating actinomycotic meningitis or fatal hemoptysis complicating pulmonary mucormycosis. diagnostic modalities include culture of the organism and the histological demonstration of the organisms in tissue. this may be facilitated by special stains such as the periodic acid-schiff (pas) or grocott's methenamine silver stain. fatal cardiac tamponade may occur with intrapericardial rupture of an amebic liver abscess due to entamoeba histolytica. 
fatal amebic meningoencephalitis may be caused by naegleria fowleri. the organism enters the arachnoid space through the cribriform plate of the nose. there is meningeal hemorrhage with fibrinoid necrosis of blood vessels. cerebral malaria does not usually cause sudden death. however, it may be the primary cause of sudden death in nonimmune persons. susceptible individuals are tourists, business travelers, and sailors. at autopsy, the brain is swollen and may have a ''slate gray'' color due to the brown-black malarial pigment called hemozoin. histology reveals petechial hemorrhages as well as intravascular parasitized red cells. small perivascular inflammatory foci called malarial or dürck's granulomas may be present. sudden death in malaria may also be due to rupture of an enlarged spleen. an enlarged spleen is fragile and more vulnerable to rupture. other infections that may lead to splenic rupture and sudden death are infectious mononucleosis and typhoid. sudden death due to cardiac involvement in chagas disease (trypanosoma cruzi) occurs in 5-10% of acute cases. the damage to the myocardium causes fatal ventricular tachycardia. histological examination shows myofiber necrosis with an acute inflammatory reaction. clusters of organisms may be found within dilated myofibers, resulting in intracellular pseudocysts. clinically occult helminthic diseases such as hydatid disease (echinococcus granulosus) and neurocysticercosis (taenia solium) may cause sudden death. in neurocysticercosis death may occur due to epilepsy or raised intracranial pressure. parasitic cysts containing scolices are present, especially in the subarachnoid space, cortical sulci, and cortical gray matter. large multilocular cysts (racemose cysts) may be present in the basilar cisterns near the cerebellopontine angle (figure 5). isolated cardiac hydatid cyst is an uncommon manifestation and accounts for fewer than 3% of all hydatid disease. 
sudden death may be the initial manifestation of the disease. death may be due to involvement of the left ventricular myocardium or to massive pulmonary embolism. all autopsies must be approached using universal precautionary principles. in sudden deaths complete autopsy examination is recommended with appropriate tissue and body fluid sampling for special investigations. autopsy sampling for microbiological investigations is indicated in the following circumstances: sudden unexpected deaths in children and adults, deaths in immunocompromised patients, deaths in patients with clinically suspected infections, and deaths with organ changes of infection. the problems encountered with autopsy microbiological testing are contamination during procurement of the sample because of poor technique or due to the postmortem spread of commensals. to prevent false-positive postmortem blood cultures the following should be observed: the body should be refrigerated as soon as possible; and movement of the body should be limited to decrease the possibility of postmortem bacterial spread. an aseptic technique should be used to collect the sample, which should be stored and transported in the correct medium and temperature. close liaison with the microbiology and virology laboratories is important to guide collection, preservation, transport, and evaluation of specimens. this is particularly important in cases where there are positive cultures with negative histological findings. sampling at multiple sites and determining the antibiotic sensitivities may be helpful in determining the significance of positive cultures. the finding of a ''pure'' as opposed to ''mixed'' culture helps to determine the significance of the findings. the type of organism in relation to the site where it was cultured also helps to differentiate contaminants from significant positive cultures. 
relevant special techniques should be used by the pathologist in order to improve the diagnostic yield in infectious diseases ( table 5) . in a small group of cases (so-called negative autopsies) no obvious cause of death is apparent after detailed initial external and internal examination. the incidence of negative autopsies is 5%-10%; this figure improves to about 5% when special tests such as postmortem chemistry and microbiology are carried out. infectious agents are not a common cause of sudden death. even in cases with little or no morphological changes, investigation of appropriate autopsy samples by recently developed laboratory techniques may prove invaluable and shed light on the cause of death. children: sudden natural infant and childhood death; sudden natural death: cardiovascular; central nervous system and miscellaneous causes in the year 2000 an estimated 815 000 people died from suicide around the world. this represents an annual global mortality rate of 14.5 per 100 000 population. according to the world health organization (who), suicide is the 13th leading cause of death worldwide. it leads among violent causes of death (e.g., suicide, homicide, traffic deaths). among those aged between 15 and 44 years, suicide is the fourth leading cause of death, and violence against the self is the sixth leading cause of disability. suicidal behavior ranges in degree from merely thinking about ending one's life, through developing a plan to commit suicide and obtaining the means to do so, attempting to kill oneself, to finally carrying out the act of ''completed suicide''. the term ''suicide'' is based on the latin words sui (of oneself) and caedere (to kill). 
the encyclopaedia britannica defines suicide as: ''the human act of self-inflicting one's own life cessation.'' however, it is often difficult to reconstruct the thoughts of people who commit suicide unless they have made clear statements before their death since all suicidal deaths are not clearly planned. in many legal systems, a death is certified as suicide if murder, accidental death, and natural causes can all be ruled out and if the circumstances are consistent with suicide. this article deals with fatal suicidal behavior. this is the term proposed for suicidal acts that result in death and that directly concern forensic medicine; it does not cover nonfatal suicidal behavior, attempted suicide, or deliberate self-harm, i.e., suicidal actions that do not result in death and which may be referred to psychiatrists. even if it is not always clearly planned, suicide is a result of an act deliberately initiated and performed by a person in expectation of its fatal outcome. suicide is also now a major public health problem, as evidenced by epidemiologic data. according to who, taken as an average for 53 countries for which complete data are available, the age-standardized suicide rate for 2000 was 14.5 per 100 000. the rate for males was 22.9 per 100 000 and for females 6.8 per 100 000. the rate of suicide is almost universally higher among men compared to women by an aggregate ratio of 3.5 to 1. for some countries the most recent data are shown in table 1 . over nearly 30 years , for 39 countries for which complete data are available, the suicide rates seem to have remained quite stable. geographically, changes in suicide rates vary considerably. 
according to the french national institute on demographic studies (ined; institut national d'études démographiques), which provides reliable information on suicide mortality, the rates range from 40.1 per 100 000 in the russian federation to 31.6 per 100 000 in hungary, 25.1 per 100 000 in japan,
key: cord-230345-bu6vi7xz authors: bayes, cristian; rosas, victor sal y; valdivieso, luis title: modelling death rates due to covid-19: a bayesian approach date: 2020-04-06 journal: nan doi: nan sha: doc_id: 230345 cord_uid: bu6vi7xz objective: to estimate the number of deaths in peru due to covid-19. design: with a priori information obtained from the daily number of deaths due to covid-19 in china and data from the peruvian authorities, we constructed a predictive bayesian non-linear model for the number of deaths in peru. exposure: covid-19. outcome: number of deaths. results: assuming an intervention level similar to the one implemented in china, the total number of deaths in peru is expected to be 612 (95%ci: 604.3-833.7) persons. sixty-four days after the first reported death, 99% of expected deaths will have been observed. the inflection point in the number of deaths is estimated to be around day 26 (95%ci: 25.1-26.8) after the first reported death. conclusion: these estimates can help authorities to monitor the epidemic and implement strategies in order to manage the covid-19 pandemic. there is a trend to forecast covid-19 using mathematical models for the probability of moving between states from susceptible to infected, and then to a recovered state or death (sir models). this approach is, however, very sensitive to starting assumptions and tends to overestimate the virus reproductive rate. 
one key point that these models miss is the individual behavioral responses and government-mandated policies that can dramatically influence the course of the epidemic. in wuhan, for instance, strict social distancing was instituted on january 23rd, 2020, and by march 15th new infections were close to zero. taking this observation into account, covid et al. (2020) have proposed a statistical approach to model an empirical cumulative population death rate. however, modelling observed cumulative death numbers has the inherent problem of yielding highly correlated data which, if not taken into account, could cause misleading inference results. instead of considering a mathematical model such as sir, which relies heavily on parameter assumptions, we will work directly with empirical data, as in covid et al. (2020) or (zhou et al., 2020). to this end, we propose to model the daily number of deaths using a poisson distribution with a rate parameter that is proportional to a skew normal density. using a bayesian approach and the prior history of the covid-19 epidemic in china, we forecast the total number of deaths in peru for the next seventy days. let $y_{t_1}, y_{t_2}, \ldots, y_{t_n}$ be the numbers of covid-19 deaths at times $t_1, t_2, \ldots, t_n$, where time is measured from the first reported death due to covid-19, and let us suppose that the death rate will hit a plateau due to the government intervention. we then propose the model
$$y_{t_i} \sim \mathrm{Poisson}(\lambda(t_i)), \quad i = 1, 2, \ldots, n \qquad (1)$$
with death rate $\lambda(t_i)$, where $g(t_i \mid \alpha, \beta, \eta)$ denotes the density function of a skew normal distribution with location, scale, and shape parameters $\alpha$, $\beta$, and $\eta$, respectively; $p$ is a maximum asymptotic level parameter and $k$ is the population size. 
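The model above can be sketched numerically. The explicit expression for the rate is not legible in the text here, so the sketch below assumes the simplest form consistent with the description, namely a rate equal to the product of the level parameter, the population size and the skew normal density. That form, the function names and all parameter values are illustrative assumptions, not the authors' code or estimates.

```python
import numpy as np
from scipy.stats import skewnorm


def death_rate(t, p, K, alpha, beta, eta):
    """Assumed rate: lambda(t) = p * K * g(t | alpha, beta, eta)."""
    # g is the skew normal density with location alpha, scale beta, shape eta.
    return p * K * skewnorm.pdf(t, eta, loc=alpha, scale=beta)


def simulate_daily_deaths(t, p, K, alpha, beta, eta, seed=0):
    """Draw one Poisson count per day from the assumed rate."""
    rng = np.random.default_rng(seed)
    return rng.poisson(death_rate(t, p, K, alpha, beta, eta))


# Illustrative values only; they are not estimates from the text.
days = np.arange(1, 71)
params = dict(p=2e-5, K=32_000_000, alpha=15.0, beta=15.0, eta=3.0)
lam = death_rate(days, **params)
y = simulate_daily_deaths(days, **params)
print("expected total deaths:", round(float(lam.sum()), 1))
print("simulated total deaths:", int(y.sum()))
```

Under this assumed form the density integrates to one, so the expected total number of deaths is approximately the product of the level parameter and the population size, which is one way to read the description of p as an asymptotic level.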
the choice of the skew normal distribution is motivated by the flexibility of its tails and its asymmetry, which can produce, as one would expect, a rapidly increasing rate in the first stages of the pandemic and a slower decrease of this rate in the last stages. a negative binomial distribution can also be considered for the death counts, but we found empirically a better fit with the poisson model. our model differs from the approach taken by covid et al. (2020), who directly model the cumulative death rate $\Lambda(t) = \int_0^t \lambda(s)\,ds$ with a term proportional to a symmetric cumulative normal distribution. this difference is found not only in the formulation, but also in the estimation procedure. while these authors first incorporated the chinese data (more concretely, the time from when the initial death rate exceeds 1e-15 to the implementation of social distancing) into their model through a location-specific inflection point parameter or a maximum death rate and then used a sort of credibility model between short-range and long-range variants, we followed a bayesian approach that incorporates the chinese data as a prior distribution. the posterior predictive distribution, which is the goal of this model, is then a dynamic object that can be updated with new data about the number of deaths or any other reliable information that may be incorporated as covariates in the model. apart from the total number of deaths, three main quantities of interest can be easily derived from our model. first, a time to threshold death rate, which provides the time after which only 1% of deaths remain to be observed in the population. this is defined as the 0.99 quantile of the g distribution. another characteristic of interest is the inflection point, defined as the time at which the death rate reaches its maximum level. 
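Both derived quantities depend only on the skew normal density g, so they can be computed directly from its parameters. The following is a minimal sketch: the time to threshold is the 0.99 quantile of g, and the inflection point is located numerically as the maximiser of g. The function names and the parameter values are illustrative assumptions.

```python
from scipy.optimize import minimize_scalar
from scipy.stats import skewnorm


def time_to_threshold(alpha, beta, eta, q=0.99):
    """Time by which a fraction q of all deaths is expected (quantile of g)."""
    return skewnorm.ppf(q, eta, loc=alpha, scale=beta)


def inflection_point(alpha, beta, eta):
    """Time at which the death rate peaks, i.e. the mode of g."""
    # Search between extreme quantiles, where the unimodal density has its peak.
    lo = skewnorm.ppf(0.001, eta, loc=alpha, scale=beta)
    hi = skewnorm.ppf(0.999, eta, loc=alpha, scale=beta)
    result = minimize_scalar(
        lambda t: -skewnorm.pdf(t, eta, loc=alpha, scale=beta),
        bounds=(lo, hi),
        method="bounded",
    )
    return result.x


# Illustrative parameter values, not estimates from the text.
t99 = time_to_threshold(15.0, 15.0, 3.0)
peak = inflection_point(15.0, 15.0, 3.0)
print(f"time to threshold: {t99:.1f} days, inflection point: {peak:.1f} days")
```

In a Bayesian fit these functions would be evaluated on every posterior draw of the three parameters, which yields posterior intervals for both quantities rather than single values.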
since china was the first country to have experienced a drastic drop in infections and deaths, we propose to incorporate these data into our peruvian death rate predictions through a prior distribution. figure 1 shows the empirical distribution of the number of reported deaths in china. one can notice that the reported numbers on february 13 and 14 (red points) were 254 and 13 deaths, respectively. there is some controversy about the numbers reported from china on these days, which is why we use the average of the two values for these days. figure 2 shows the observed and predicted death rates (a) and the daily number of deaths (b) in china. the predicted rates and their associated 95% prediction credible intervals were obtained, under a bayesian approach, by considering a non-informative prior and the chinese official death reports. table 1 summarizes the information given in figure 2. this information will be used as a prior for modelling the number of deaths in perú, with the exception of the p parameter, where a weakly informative prior, $N(0, 10^2)$, will be considered for log(p). initial values for the estimation process will be sampled from the prior distribution that was constructed with the chinese data. the chinese data used in this work were obtained from the european centre for disease prevention and control, an institution that publishes daily statistics on the covid-19 pandemic. the daily death reports in perú, on the other hand, were obtained from the local authorities. taking into account the observed likelihood function, easily derived from (1) and (2), we now briefly describe our predictions for the spread of covid-19 in perú. time is measured here in days after the first reported death in the country. in addition, figure 3b shows the expected number of deaths per day and its associated 95% predictive credibility interval. 
in particular, the expected number of deaths is 17.5 (95% ci: 9-28) and 19.3 (95% ci: 10-31) at 20 and 30 days, respectively. overall, the total number of deaths is expected to be 611.6 (95% ci: 604.3-833.7) persons. the model estimates that the number of days, since the first death, to achieve the threshold death rate will be 63.8 (95% ci: 52.9-65.84) days. at this time, 99% of expected deaths will have been observed. the estimated inflection point is 25.9 (95% ci: 25.1-26.8), which means that we expect approximately 26 days to pass after the first reported death before the covid-19 death rate starts to decline. three different scenarios were considered, each keeping the weakly informative lognormal prior distribution for p. in all cases, the posterior mean of the fitted chinese model was taken as the prior mean for the modelling of data from perú. the first scenario (i) also takes the chinese posterior variances as the peruvian prior variances. the second scenario (ii) induces flexibility in the prior distribution by increasing the corresponding standard deviations by a factor of five. finally, the third scenario (iii) increases the standard deviations by a factor of ten. table 2 and figure 4 show that the point estimates are not heavily affected, but the precision pays the price of not yet having enough data from perú. considering the information on the daily number of deaths from china and from perú, our estimate of the total number of deaths is 611.6 (95% ci: 437.2-833.7), and 99% of those deaths will occur within 63.8 (95% ci: 52.9-65.8) days after the first reported death. the present study has some limitations. although perú has not followed all the measures taken in china, we assumed that peruvian interventions will have effects similar to the ones observed in that country. however, we expect our model to be flexible enough to be driven by the peruvian data. 
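The three prior scenarios amount to keeping the prior mean fixed and multiplying the prior standard deviation by 1, 5 or 10. The sketch below shows the effect on a 95% prior interval. It assumes a normal prior for a single parameter, which the text does not state, and the mean and standard deviation used are hypothetical placeholders rather than the fitted Chinese posterior values.

```python
from dataclasses import dataclass

from scipy.stats import norm


@dataclass
class NormalPrior:
    """Normal prior for one parameter (the normal family is an assumption)."""
    mean: float
    sd: float

    def interval(self, level=0.95):
        """Central prior interval at the given probability level."""
        return norm.interval(level, loc=self.mean, scale=self.sd)


def scenario_priors(post_mean, post_sd, factors=(1, 5, 10)):
    """Scenarios (i)-(iii): same mean, standard deviation inflated by a factor."""
    return {f: NormalPrior(post_mean, post_sd * f) for f in factors}


# Hypothetical posterior summary for one parameter of the Chinese fit.
priors = scenario_priors(post_mean=20.0, post_sd=0.5)
for factor, prior in priors.items():
    low, high = prior.interval()
    print(f"sd x{factor}: 95% prior interval = ({low:.2f}, {high:.2f})")
```

Widening the prior in this way lets the Peruvian data dominate the posterior, at the cost of wider credible intervals while those data are still scarce.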
furthermore, the model will increase its precision as more data from perú become available. several extensions are possible for the model. for instance, covariates such as daily social mobility indicators or country age distributions can be included in the model, in particular on the p parameter that measures the total number of deaths. we expect our model can be useful to guide some policies that need to be taken by the peruvian government in order to overcome the covid-19 pandemic. for example, the proposed model can be useful to measure the impact of the covid-19 pandemic on the
we would like to thank rodrigo carrillo larco for providing us the detailed information of
stan: a probabilistic programming language
forecasting covid-19 impact on hospital bed-days, icu-days, ventilator-days and deaths by us state in the next 4 months
rstan: the r interface to stan
forecasting the worldwide spread of covid-19 based on logistic model and seir model
key: cord-290687-kc7t1y5o authors: ray, soumi; roy, mitu title: susceptibility and sustainability of india against covid19: a multivariate approach date: 2020-04-21 journal: nan doi: 10.1101/2020.04.16.20066159 sha: doc_id: 290687 cord_uid: kc7t1y5o purpose: we are currently in the middle of a global crisis. the covid19 pandemic has suddenly threatened the existence of human life. to date, as no medicine or vaccine has been discovered, the best way to fight this pandemic is prevention. the impact of different environmental, social, economic and health parameters is unknown and under research. it is important to identify the factors which can weaken the virus, and the nations which are more vulnerable to this virus. materials and methods: data on weather, vaccination trends, life expectancy, lung disease, and the number of infected people in the pre-lockdown and post-lockdown periods of highly infected nations are collected. these are extracted from authentic online resources and published reports. 
analysis is done to find the possible impact of each parameter on covid19. results: covid19 has no linear correlation with any of the selected parameters, though a few parameters show a non-linear relationship in the graphs. further investigations have shown better results for some parameters. a combination of the parameters results in a better correlation with infection rate. conclusions: though the impact of covid19 in india can be predicted from the study outcome, the required lockdown period cannot be calculated due to data limitations. the entire world almost came to a halt in the month of march 2020. this is one of the most unexpected and unbelievable situations in the world's history. a virus, starting its journey from wuhan city in china in december 2019, has now reached almost all major cities and has created colonies very rapidly. as per the world health organization (who), the first case of covid19 was identified on 8th december 2019 [1]. initially the disease was misunderstood as some variation of influenza. scientists, researchers and doctors, after continuous analysis, then came up with information about this novel coronavirus. though a drug is not yet on the market, the structural details of the virus are now known to us [2]. because of its similarities with the behavior of the severe acute respiratory syndrome (sars) coronavirus, this virus was named severe acute respiratory syndrome coronavirus 2 (sars-cov-2) and the disease was identified as coronavirus disease (covid19) by who [3]. seasonal diseases which have a higher mortality rate usually belong to the sars category. observing the high infection rate, on 11th march 2020, who declared covid19 a pandemic [4]. in 2003, asian countries were badly affected by the sars epidemic which originated in china. the worldwide death toll was 774, a ratio of 1:10 of registered cases [5]. in 1957, the asian flu virus claimed around a thousand lives in india [6]. 
one of the severest global pandemics in recent history was the 1918 flu. this flu was first reported in a spanish newspaper, and it infiltrated india through bombay port. the disease was contagious. it claimed 400 million lives worldwide [7]. it is claimed that one third of the population was infected. different articles claimed china as the origin of the flu [8] [9] [10]. but deaths in china itself were very few, whereas india lost almost one fifth of its population [11]. the estimated death toll was 14 million [12]. the flu was so deadly that it reduced the population of india for the first and, to date, the last time in its history. but the virus disappeared almost suddenly after a few months. it is assumed that, like any other pathogen, this virus also rapidly mutated to a less lethal strain and then finally died out [5]. covid19 has many similarities with the 1918 flu and 2003 sars. all are viral and contagious infections, epidemic in nature, turning into pandemics within a few weeks, transmitted through droplets and very easily transmissible from human to human. it is also suspected that the coronavirus can be transmitted through air and can survive in the environment for a long time without any decrease in its efficiency, which increases the chance of infection [13]. if the coronavirus is transmissible and airborne, this is an alarming situation for india. normally during the change of season, indians suffer from different common ailments like cold and cough, nasal congestion, conjunctivitis etc. many of these diseases are infectious and transmissible. a large part of the population commonly suffers from one or more of these issues. along with those common epidemics, covid19 has to be taken care of. due to nor'westers during march/april the chance of fast spreading of covid19 is also high. many people are already infected, not all of whom have significant symptoms, and hence they are not identified. 
finding ways to stop community transmission, case identification at initial stage and controlling the death rate is an emergency. even the cure will not be easy to save the world if the infection is not prevented. faster and easier worldwide transportation system has turned into a curse in case of viral epidemics. due to international travels, almost all the countries in the world have got infected by this transmissible virus within a very short duration simultaneously. covid19 has affected 209 countries and territories around the world as of 8th april 2020. from the information of the impact of 1918 pandemic in india, we have tried to understand the underlying facts. a large population who lived near and below poverty line got affected by the 1918 pandemic [14] . apparently, it seems that, the sanitation had a significant relation with the disease infection, spread and severity. but the high mortality rate in case of covid19 even in developed countries raises a question on this easy assumption. light from a different angle may have some answer to it. people from lower economic segment not only failed to maintain sanitation but also suffered from improper diet. lack of consumption of proper and healthy food weakened their immunity. before 1918, no vaccine other than for smallpox was available. no vaccine or antibacterial was invented to prevent the pandemic diseases. it is not very difficult to assume that the practice of vaccination among the poor people in india was also low. so, hygiene was not solely responsible for the devastating death toll of 1918. immunity also played a bigger role in it. in this article, we have tried to give some insight from available worldwide information, in order to understand the nature of this new coronavirus infection which is causing covid19. we have tried to inspect its dependencies on other known parameters. 
a measure of the possible effects of different parameters on the outbreak and infection growth has also been estimated in order to understand the risk in india. scientists from different parts of the world have already carried out research and published valuable information. all the important outcomes have been considered and examined before concluding our findings. a common drawback is associated with most of the reported works. they have discussed the impact of single dimensions like temperature, vaccination or the life cycle of the virus. this has restricted the scope for understanding the virus's overall activities and limited the chance of prediction and prevention. we have tried to overcome this limitation by replacing univariate analysis with a multivariate approach. this article has considered different possible aspects to get a robust outcome. to keep the result as unbiased as possible, we have collected data for multiple cities and countries all over the world with different geographical locations and climatic conditions. we have conducted analysis of several relevant factors to look into the situation from all possible angles. the possible dependency of the number of total infections and deaths on environmental conditions has been examined, taking different weather parameters as independent variables. for this purpose, data for 43 cities all over the world have been considered. these cities are significantly affected by sars-cov-2. we have collected the covid19 related data from the who site and other data from particular websites for each individual type of parameter to minimize the biases. the duration considered to check the dependency on weather parameters is from 1st to 27th march, because by 1st march a large number of countries had been significantly infected. the days have been limited to the 27th, as by the 3rd week of march the majority of the countries had applied social distancing and isolation. data beyond that time may be significantly affected by isolation. 
isolation introduces noise into the measures of infection and death, because it restricts virus transmission by lowering the availability of hosts. the weather information for all these locations has been collected from a single website [15], so that any noise or bias in the information is at least uniform across locations. we have also taken a measure of the population to compare the infection rate, since human-to-human infection depends on the population available for transmission. in addition to these parameters, we have checked the impact of lockdown: how the duration of lockdown affects the number of new infections has been examined to understand its importance. we have also considered life expectancy to inspect its impact on the number of infected cases and deaths. life expectancy reflects the impact of different parameters such as average living standard, socioeconomic situation, quality of health services, natural calamities etc. a comparison with life expectancy therefore stands in for a relation with all those hidden parameters, though the contribution of each is not accessible. the data are collected from united nations development programme reports [16]. as proposed in a paper [17], vaccination may have a great impact on death. in this article, the data on bacillus calmette-guérin (bcg) vaccination have been compared with the present death rate of different countries. this observation has a significant impact on our final outcome. these data are collected from who and review articles [18] [19] [20]. the additional factors included in this study are lung cancer (lc), chronic obstructive pulmonary disease (copd) and lower respiratory infection (lri). these diseases have shown an impact on death rate in many countries which are badly affected by coronavirus. the required data are retrieved from online resources [21].
the copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. it is made available under a cc-by-nc-nd 4.0 international license. the total number of cases per million and the total deaths per million are strongly correlated as per our examination (ρ=0.69, p=6.2e-31) as of 12th april 2020. because of this correlation, either of these two parameters can be used for prediction analysis and the outcome will remain comparable. in this article we have used either of these parameters to understand the impact of all other parameters on covid19, and the results are compared later. we have divided our test results into different parts. in the first part we discuss the impact of different weather parameters on the number of infected cases. we have not considered active cases, because that number changes continuously with the addition of new cases and the removal of recoveries and deaths. initially the number of infected cases increased rapidly due to lack of awareness, availability of more hosts, free movement of hosts etc., and then gradually decreased with the enforcement of quarantine and growing awareness of measures such as regular hand washing and social distancing. these qualitative parameters are not traceable but have a significant impact on the number of new cases. the cities selected in this article have different geographical locations with widely varying atmospheric conditions. if any of the considered weather parameters has a significant effect on infection transmission, then it should affect transmission equally, irrespective of the location. the list of the cities with parameter details is given in the appendix. we have taken data per million to nullify the population bias.
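the per-million normalization and the correlation check described above can be sketched as follows. this is a minimal illustration, not the authors' code: the function names `per_million` and `pearson` are hypothetical, and the correlation is written out by hand using only the standard definition of pearson's ρ.

```python
from math import sqrt


def per_million(count, population):
    """Scale a raw count to a rate per one million inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 1_000_000 / population


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

with real data, `xs` would hold cases per million for each location and `ys` the matching deaths per million (or a weather parameter); a p-value for ρ would need an additional significance test, which is left out of this sketch.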
as the virus can be transmitted from human to human and can travel a small distance through droplets with airflow, a higher population increases the availability of new hosts and hence the rate of infection. our target is to inspect the significance of that impact. this study is very important for india as its population is very high. the correlation between the number of registered infected cases per million and different weather parameters (temperature measures, humidity, dew point and precipitation) has been computed. we have used pearson's correlation to check the dependency of identified case numbers on the other parameters. we have used linear regression to understand the linearity of the relationship and its statistical significance, and to make a prediction. the results are presented in tabular form in table 1. the p-value in each case is too high to reject the null hypothesis, so no considerable correlation is supported. but we decided to inspect further. each parameter is examined separately to find any possible impact on covid19 deaths. this method offers interesting information about lowest temperature and highest temperature. we have considered the death of 0.01% of the population as the threshold; a rate above this is considered a matter of concern. as per the central intelligence agency, the present population of india is 1,326,093,247 [22]. 0.01% of this population is about 132,609, which is a few thousand more than the present death count all over the world and not an ignorable count. the countries which experienced a higher death rate (as well as a higher death count) had a minimum temperature below 0°c, as shown in figure 1. the y-axes are changed to logarithmic scale to enhance the visibility of the datapoints.
the linear trend lines are also shown in the figures for visual depiction of our understanding, that is with increase in minimum temperature the death rate and total death (hence infection rate) decreases. on contrary, for highest temperature of a place, a range of the temperatures has shown higher risk of death due to covid19. the countries with high death rate had highest day temperature in between 17 to 25 degree centigrade as per figure 2. we have given a try to understand the effect of lockdown. how this lockdown is impacting the community transmission is an important study. the entire world is depending on this policy in absence of vaccines. a detailed study of this factor will help us to predict the required lockdown duration for india. we have taken the number of new registered cases of different countries at the starting of lockdown and then after completion of each week till 9 th april 2020. we have considered the total change per week to reduce the noise of a temporary change in rate. a sudden change in the number for one or two days without any consistency does not portray any significant change of the overall situation of a country. the result is presented as graph in figure 3. only norway and italy have entered into the 5 th week of lockdown and both countries have shown a drop in new cases after 4 th week. but the steadiness is which is yet to be known, and it is highly required to conclude in a positive note. italy has completed 4 days of 5 th week on 9 th april 2020. the new cases registered in this duration are 14314, which is a little lower rate than the rate per day of 4 th week of lockdown. austria and australia have shown notable reduction after 3 weeks of lockdown. after 4 th day of the 4 th week, the average per day new cases is still very low in austria. australia has just completed 1 day of next week as on 10 th april. though per day rate is lower than last week but it is higher for new cases registered in each of the last 2 days. 
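the weekly aggregation used in the lockdown comparison above can be sketched as follows. this is only an illustration under stated assumptions: the helper names `weekly_totals` and `per_day_average` are hypothetical, and weeks are counted as consecutive 7-day blocks from the first day of lockdown, with the last block allowed to be partial.

```python
def weekly_totals(daily_new_cases):
    """Group daily new cases into consecutive 7-day blocks.

    Returns a list of (total_cases, number_of_days) tuples; the final
    block may cover fewer than 7 days if the week is not yet complete.
    """
    blocks = []
    for start in range(0, len(daily_new_cases), 7):
        week = daily_new_cases[start:start + 7]
        blocks.append((sum(week), len(week)))
    return blocks


def per_day_average(total_cases, days):
    """Average new cases per day, so that partial weeks stay comparable."""
    if days <= 0:
        raise ValueError("days must be positive")
    return total_cases / days
```

for the partial 5th week of italy quoted in the text (14314 new cases in 4 days), `per_day_average(14314, 4)` gives the per-day rate that is compared with the 4th week.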
though no conclusion can be drawn about prevention of infection through lockdown, there is tentative evidence of a slowdown in the rate of spread of the covid19 pandemic. for a decisive interpretation, the impact of lockdown needs to be observed for a few more weeks. medrxiv preprint doi: https://doi.org/10.1101/2020.04.16.20066159. life expectancy has no direct impact on infection growth or on the resulting deaths. but life expectancy is connected with different factors, as discussed in the introduction. along with those, depression, low average education and unemployment also have an impact on the life span of a community. to have a measure of all such hidden factors, we have checked the relation between life expectancy and death rate. here we have taken deaths per million and compared how they change with respect to life expectancy. the graph is shown in figure 4. the surprising observation is that a high death rate is mostly associated with high life expectancy. an inspection of causation is required to understand the actual meaning of this finding. this is beyond the scope of our present study and can be considered in a more advanced analysis. not all the affected countries are equally prone to covid19, as per the report of who. though our preceding analyses have failed to find any strong relationship between the discussed parameters and this pandemic, it is evident that some factors influence the rate of spread of this infection and the death rate. a published study [17] showed some hope in its examination of the relation between covid19 and bacillus calmette-guérin (bcg) vaccination. we have repeated that study to find the relation under the present scenario. that work was reported in the middle of march, when many countries were not as affected as they are today.
the scenario in usa has changed dramatically in the last 3 weeks; india has also shown a significant increase in the number of infected cases in the last 15 days. hence, a repetition of the analysis is necessary before drawing any conclusion. our study has shown that vaccination has a good impact in most of the cases, but there are many exceptions too. the comparison is presented in figure 5. france, iran, ireland, portugal and sweden have significant death tolls even with a good history of vaccination. on the contrary, australia and canada have quite low death rates without a vaccination program. hence the hypothesis of any direct relation between bcg vaccination and covid19 has been rejected. a further, in-depth analysis including more details of the vaccination program and its coverage of the population, especially of the population below the poverty level, is required to find the exact relation. a low-level dependency of death rate on bcg vaccination is visible in the graph, and it is supported by a statistical analysis of correlation resulting in ρ = 0.382 and p-value = 0.054. covid19 is a severe acute respiratory syndrome (sars) disease. it attacks human lungs, causes trouble in breathing and can gradually become deadly. to understand its relation with other lung diseases, a study has been conducted. the top 10 diseases of each country are examined to see the burden of other lung diseases. after preliminary scrutiny, we decided to consider lung cancer (lc), chronic obstructive pulmonary disease (copd) and lower respiratory infection (lri). data cleaning and preprocessing were done before analysis. the rank of each of these diseases, where it is among the top 10 diseases primarily responsible for death in a country, has been considered for further analysis.
if a disease is not listed in a country's top 10 diseases, a numerical value of 0 has been used. for these 3 diseases taken together, the correlation with death rate is ρ = 0.535 with p-value = 0.06. the p-value leans towards the null hypothesis even though the correlation is moderate. lc alone relates better to the death rate: with ρ = 0.477 and p-value = 0.013, lc shows a statistically significant association with the death rate due to covid19. this finding does not offer any way of preventing the pandemic but gives an idea about the risk of a country. none of the parameters discussed above has a significant effect on covid19 except lung cancer. though a few have shown an impact on the increase in the number of cases (or deaths) in some cities, nothing convincing has been found. it seems that the disease gets transmitted with almost equal potential in different atmospheres. the impact of lockdown is also not clear enough to give any definite suggestion. hence the question remains unsolved: how is india going to respond to this pandemic? negative minimum temperature, a specific range of maximum temperature, lack of bcg vaccination and a tendency towards other lung diseases are each associated with some increase in the number of covid19 cases and deaths. we have combined all these four parameters to see their combined effect on death rate. before statistical analysis, we have preprocessed each parameter to create four distinct features. the temperature data are continuous in nature, and a minute change has no significant impact. using the knowledge already acquired from the previous analysis, we have classified each temperature into two classes. in the case of minimum temperature, values equal to or below 0°c are considered as one class, which has a strong adverse impact on death and is hence represented by 10. the rest of the temperature values, which have less impact or no impact on death, are represented by 1.
similarly, maximum temperatures between 17 and 25°c are converted to 10 and the rest to 1. disease scores are divided into 3 classes: 10 to 6, 5 to 1, and 0. a score greater than 5 means the disease ranks within the top 5, and such a disease is represented by 10. a score from 5 to 1 means the disease ranks between 6 and 10, and such a disease is represented by 5. when a disease is not listed in the top 10 diseases of the corresponding country, it is represented by 0. these data preprocessing steps are important for better insight. vaccination is represented by the number of years the program has continued in a country since 1980, as mentioned before. after creating a clean dataset with these four features, statistical analysis was done to check whether any useful correlation is present. this analysis gives a prominent and statistically significant correlation, ρ = 0.634 (p = 0.024). the features are plotted in figure 7 along with increasing death rate. in the figure, singapore, australia and france are exceptions to the vaccination impact. the probable reason can be the temperature. both singapore and australia are hot countries. australia is not prone to any lung disease and singapore has a low tendency of lc. on the other hand, france has low temperature and a high impact of lc on the country's death rate. hence, a significant impact of these parameters can be assumed, and this needs further research for a definite conclusion. in april, the temperature remains significantly high in india (non-hill zones). in most of the areas, especially in the cities which are reporting high rates of cases and deaths, the lowest temperature remains higher than 15°c and the maximum temperature goes well above 30°c. india has also had different vaccination programs for years, and bcg vaccination is done for almost 90% of the population. the negative factor is that lung diseases are very common in india.
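the feature preprocessing described above (two temperature classes, three disease classes and the vaccination years) can be sketched as follows. this is a hedged reading of the text rather than the authors' implementation: the function names are hypothetical, the 17 to 25°c range is assumed to be inclusive at both ends, and the disease classes are expressed directly through the rank in a country's top 10 causes of death (top 5 gives 10, ranks 6 to 10 give 5, not listed gives 0).

```python
def encode_min_temp(t_min_c):
    """Minimum temperature: 10 if at or below 0 °C, otherwise 1."""
    return 10 if t_min_c <= 0 else 1


def encode_max_temp(t_max_c):
    """Maximum temperature: 10 inside 17-25 °C (assumed inclusive), otherwise 1."""
    return 10 if 17 <= t_max_c <= 25 else 1


def encode_disease_rank(rank):
    """Disease burden from its rank in a country's top 10 causes of death.

    rank is None when the disease is not listed in the top 10.
    """
    if rank is None:
        return 0
    if 1 <= rank <= 5:
        return 10
    if 6 <= rank <= 10:
        return 5
    raise ValueError("rank must be None or between 1 and 10")


def encode_features(t_min_c, t_max_c, disease_rank, bcg_years_since_1980):
    """Build the four features; vaccination is kept as a number of years."""
    return {
        "min_temp": encode_min_temp(t_min_c),
        "max_temp": encode_max_temp(t_max_c),
        "lung_disease": encode_disease_rank(disease_rank),
        "bcg_years": bcg_years_since_1980,
    }
```

mapping the continuous temperatures onto two widely separated values (1 and 10) follows the stated idea that small temperature changes carry no information while the class membership does.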
copd (rank 2), lri (rank 5) and tuberculosis (rank 6) are major causes of death here, though lc is not that common like other countries which are badly affected by covid19 pandemic. this information and data dependent statistical analysis is not self-sufficient to understand the nature of coronavirus. along with this geographic, demographic and meteorological analysis, information from other branches of science like virology, biotechnology must be considered. an important finding about such pandemic is their sudden disappearance after few months. it happened every time in the world's history. not considering the expected life of sars-cov2 with respect to environmental conditions and continuous mutation will be impractical. the faster the virus will selflimit itself, the earlier the rate of infection will go down. india has imposed quarantine through locked down for 21 days starting from 23 rd of march 2020. in last 15 days the number of cases as well as death has increased significantly. the lockdown do restrict the community transmission but the rate is increasing may be because of detection of already infected cases. fast identification of old cases is required to access the effect of isolation on infection transmission rate. the mortality and morbidity ratio may be affected by the immune system of a population and will vary with different life style factors starting from food habits, common diseases, vaccination programs etc. in the high altitude areas (mainly the himalayan region) with low temperature throughout the year, the risk is higher as per our analysis. february, march are not tourism season of these himalayan region of india because of chilling cold, road blockage due to snowfall and for academic session ending with examinations. probably due to little tourist flow, these regions are still not reporting cases of covid19. 
once the severity of the pandemic falls and the country starts resuming its normal life, the free movement of tourists can be a threat for those hilly areas, and it could ignite a reappearance of the disease in cities too through the returning tourists. a proper protection plan and restricted movement for a reasonably longer time after the pandemic are required to keep the citizens safe. to summarize the risk of india with more accuracy, we have to consider the socioeconomic condition too. a good vaccination program and a history of seasonal endemics may give better immunity against similar diseases. sars-cov-2 is a new virus and hence the old antibodies cannot completely prevent it. but whether any existing antibody has a significant impact on its growth is under research. cyclicity is ubiquitous for acute infectious diseases [23]. each disease is unique in its own way. the sars diseases occupy a span of two months, march and april, in the indian epidemic calendar. the reason is that seasonal variation has a significant impact on the transmission of infectious diseases. this is known as seasonal forcing. if the tuberculosis vaccine is protective against covid19, then the antibodies already present in the blood will reduce the severity of the disease and the death rate in india. the epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (covid-19) in china features, evaluation and treatment coronavirus (covid-19). in: statpearls [internet covid-2019)-and-the-virus-that-causes-it] 4. who director-general's opening remarks at the media briefing on covid-19 -11 sars (severe acute respiratory syndrome the 1957 pandemic of influenza in india did the 1918-19 influenza pandemic originate in china?
population and development review what happened in china during the 1918 influenza pandemic? flu pandemic that killed 50 million originated in china, historians say flu experts warn of need for pandemic plans mortality from the influenza pandemic of 1918-1919: the case of india aerosol and surface stability of sars-cov-2 as compared with sars-cov-1 estimation of potential global pandemic influenza mortality on the basis of vital registry data from the 1918-20 pandemic: a quantitative analysis correlation between universal bcg vaccination policy and reduced morbidity and mortality for covid-19: an epidemiological study connecting bcg vaccination and covid-19: additional data global, regional, and national burden of tuberculosis, 1990-2016: results from the global burden of diseases, injuries, and risk factors soper he: the interpretation of periodicity in disease prevalence the first author, soumi ray, ph.d. in image analysis from the indian institute of technology roorkee, has done the data analysis, explored useful insights, prepared the manuscript and concluded the work. the second author, mitu roy, b.tech in computer science from haldia institute of technology, conceived the content and retrieved the data to help the first author. none. both the authors declare to have no conflict of interest. none. key: cord-288678-ptvaopgj authors: li, jing; wang, lishi; guo, sumin; xie, ning; yao, lan; cao, yanhong; day, sara w.; howard, scott c.; graff, j. carolyn; gu, tianshu; ji, jiafu; gu, weikuan; sun, dianjun title: the data set for patient information based algorithm to predict mortality cause by covid-19 date: 2020-04-24 journal: data brief doi: 10.1016/j.dib.2020.105619 sha: doc_id: 288678 cord_uid: ptvaopgj the data of covid-19 disease in china and then in south korea were collected daily from several different official websites. 
the collected data included 33 death cases in wuhan city of hubei province during the early outbreak, as well as confirmed cases and death tolls in some specific regions, which were chosen as representative of the coronavirus outbreak in china. data were copied and pasted onto excel spreadsheets to perform data analysis. a new methodology, the patient information based algorithm (piba) [1], has been adapted to process the data and used to estimate the death rate of covid-19 in real time. the assumption is that the number of days from inpatient admission to death follows a normal distribution, and that the scores in the normal distribution can be obtained by observing 33 death cases and analysing the data [2]. we selected 5 scores in the normal distribution of these durations as lagging days, which are used in the following estimation of death rate. we calculated each death rate on accumulative confirmed cases with each lagging day from the current data, and then weighted every death rate with its corresponding possibility to obtain the total death rate on each day. where the trendlines of these death rate curves meet the curve of the current ratio between accumulative death cases and confirmed cases at some point in the near future, we consider that these intersections lie within the range of real death rates. six tables are presented to illustrate the piba method using data from china and south korea. one figure presents the estimated rate of infection and of patients in serious condition, and a retrospective estimation of the initial occurrence time of covid-19 based on piba. table subject death rate estimation using normal distribution, of mean, standard deviations and formulas. the data estimation focuses on the early estimation of the death rate of infectious diseases, in particular the disease covid-19 caused by 2019-ncov. collected data are formatted on excel spreadsheets for analysis. data include the total number of patients, total number of deaths, daily number of new patients and daily number of new deaths, from the starting date of official reports to the presented time, e.g., march 22, 2020. data were collected through the links of each official website, and the desired data were copied and pasted onto excel spreadsheets. these data provide the scientific community with a new methodology to estimate the death rate and then predict the death cases during an epidemic.
scientific researchers, cdc employees, government officers for disease control and management, and the general public will benefit from these data. these data will be very useful for studies aimed either at disease control management or at the preparation of related resources to combat an outbreak. due to the limited number of data samples collected in this article, some factors that might impact the death rate of an epidemic, such as the phases of an outbreak and the measures issued by the department of disease control, could be taken into account for further insights and for the development of experiments with a larger amount of data. (table 1 abbreviation: chd, coronary heart disease.) the data of 33 death cases in table 1 have been collected from the official website of the health commission of hubei province in china, and include the date of symptom onset, the date the patient was taken into icu and the date of death. with these data, the number of days both from symptom onset to death and from icu intake to death can be calculated. assuming a normal distribution, the mean score μ and standard deviation σ can also be calculated. thus the 5 selected scores in the normal distribution (μ, μ ± σ and μ ± 2σ) can be obtained as the basic elements for the following estimation and prediction of death rate; they are respectively 2, 8, 13, 19 and 25 days. the disease information in table 2 has been collected from the public media; we then resumed data analysis for south korea with the same method of death rate estimation and prediction as for china [1]. we have collected accumulative confirmed cases and deaths, and then new confirmed cases and new deaths, in south korea. [table residue: death rate 1, from the date of symptoms, by day from 2020-03-11 to 2020-03-15.] each score we selected in the normal distribution has a specific possibility when we take it as a representative of the bell curve [1].
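the derivation of the 5 lagging days from the observed durations can be sketched as follows. this is a minimal sketch under stated assumptions, not the published piba code: the function name `lag_days` is hypothetical, the sample standard deviation is used (the source does not say which estimator was applied), and each score is rounded to a whole day.

```python
from statistics import mean, stdev


def lag_days(durations):
    """Return the five scores mu-2s, mu-s, mu, mu+s, mu+2s as whole days.

    durations: observed numbers of days from symptom onset (or ICU intake)
    to death for the individual cases.
    """
    if len(durations) < 2:
        raise ValueError("at least two durations are needed")
    mu = mean(durations)
    sigma = stdev(durations)  # sample standard deviation (assumption)
    return [round(mu + k * sigma) for k in (-2, -1, 0, 1, 2)]
```

the scores reported in the text (2, 8, 13, 19 and 25 days) are roughly what such a calculation gives for a mean near 13 days and a standard deviation near 6 days; the underlying 33 durations are in table 1 and are not reproduced here.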
when we weight each death rate on a day by its corresponding possibility and then sum, the total death rate on each day is obtained. each curve consisting of several death rates has a trendline, and thus a formula describing this trend, as does the current ratio between accumulative death cases and confirmed cases on each day (table 4). [table residue: current ratio between accumulative death cases and confirmed cases, by day from 2020-03-11 to 2020-03-15.] the current ratio between accumulative death cases and confirmed cases is calculated by dividing accumulative death cases by accumulative confirmed cases on each day. the intersection points of the three trendlines give a death rate in south korea of 0.92% (intersect 1) and 1.06% (intersect 2). the trendlines of death rate 1 and death rate 2 tend eventually to intersect the trendline of the current ratio, because the current ratio will be the real death rate at the end of the epidemic. we considered that the intersection values of the three trendlines (death rate 1, death rate 2 and the current ratio) will fall within the range of the real death rate. when we calculated the death rate separately with the corresponding formulas of their trendlines, two intersections were acquired (table 5-b). we pick the maximum of the two to predict new death cases in the following days (table 6). tables are produced based on the patient information based algorithm (piba) [1]. piba has been adapted for estimating the death rate of covid-19 in real time with publicly posted data. assuming a normal distribution, the different durations between symptom appearance and death, with their different possibilities, have been derived from analysing 33 death cases in wuhan city of hubei province in china [2]. based on these results, the total death rate in a region can be calculated by combining the death rates obtained with the different durations.
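the weighting of lagged death rates described above can be sketched as follows. this is an illustration under explicit assumptions and should not be read as the published piba formulas: the function names are hypothetical, the lagged death rate is taken as accumulative deaths on day t divided by accumulative confirmed cases on day t minus the lag, and the possibilities of the five scores are approximated by the normal density at z = -2, -1, 0, 1, 2, normalized to sum to 1.

```python
from math import exp


def normal_weights(z_scores=(-2, -1, 0, 1, 2)):
    """Normalized normal-density weights for the selected scores (assumption)."""
    densities = [exp(-z * z / 2) for z in z_scores]
    total = sum(densities)
    return [d / total for d in densities]


def current_ratio(cum_deaths, cum_confirmed, t):
    """Accumulative deaths divided by accumulative confirmed cases on day t."""
    return cum_deaths[t] / cum_confirmed[t]


def weighted_death_rate(cum_deaths, cum_confirmed, t, lags, weights):
    """Sum of lagged death rates on day t, each weighted by its possibility."""
    total = 0.0
    for lag, weight in zip(lags, weights):
        if t - lag < 0:
            raise ValueError("series too short for the requested lag")
        confirmed_then = cum_confirmed[t - lag]
        if confirmed_then == 0:
            raise ValueError("no confirmed cases on the lagged day")
        total += weight * cum_deaths[t] / confirmed_then
    return total
```

while case counts are still growing, the lagged rates divide by smaller, earlier case counts, so the weighted rate lies above the current ratio; the text uses the point where their trendlines meet as the estimate of the real death rate.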
while the trendline of these death rate curves meet the curve of current ratio between accumulative death cases and confirmed cases at some points in the near future, we considered that these intersections are within the range of real death rates. the data analysis was all following normal distribution, either in calculating the possibility of every selected score or in estimating the death rate. after collection of data of covid patients from south korea, the data was analysed with piba method as indicated above ( table 2 ). the death rate was first estimated ( table 3 ). the death rate then was calculated (table 4 ). following estimations, the piba method then was used to predict the number of deaths in the following week (table 5 ). the predicated death numbers then were compared to the real death numbers (table 6 ). considerably lower than expected. prior expectation has been much higher, based on multiple infectious routes [3] [4] . using our formula, the results indicate that the current infectious rate is even lower than the rate based on the total numbers (see fig. 1a ). the infectious rate in hubei province is currently around 4%, although previously the rate was as high as 39%. on average, the infectious rate overall in china is about 4%, while in hubei it is 10%. in the rest of the country, it is 0.46%. among the inpatients, the rate in serious medical condition ranges from 10% to 30% (see fig. 1b ), while it averages at 18% in china, 19% in hubei, and 13% in the rest of country (except hubei). based on the estimated death rate, on january 22, there should be a total of 150 to 300 inpatients (see fig. 5c ). based on the rate of patients who are severely ill among all patients, on january 2, there should be 216 to 315 patients. based on the effective infection rate and based on the assumption of one week or 14 days from close contact to the onset of symptoms, there might be 2,160 to 68,478 people who were infected around december 20, 2019. 
if we believe the epidemic doubling time is approximately 6 days, the initial infection source may date back to as early as november or october 2019. dianjun sun. real-time estimation and prediction of mortality caused by covid-19 with patient information based algorithm clinical features of patients infected with 2019 novel coronavirus in wuhan nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in wuhan, china: a descriptive study this work was partially supported by funding from merit grant i01 bx000671 to wg from the the authors declare that they have no known competing financial interests or personal relationships which have, or could be perceived to have, influenced the work reported in this article. key: cord-281406-d7g0pbj4 authors: chen, yifei; zhao, meizhen; wu, yifan; zang, shuang title: epidemiological analysis of the early 38 fatalities in hubei, china, of the coronavirus disease 2019 date: 2020-04-24 journal: journal of global health doi: 10.7189/jogh-10-011004 sha: doc_id: 281406 cord_uid: d7g0pbj4 background: since the emergence of coronavirus disease 2019 (covid-19) in hubei province of china by the end of 2019, it has burned its way across the globe, resulting in a still fast-growing death toll that far exceeded those from severe acute respiratory syndrome (sars) in less than two months. as there is a paucity of evidence on which population is more likely to progress into severe conditions among cases, we looked into the first cluster of death cases, aiming to add to current evidence and reduce panic among the population. 
methods: we prospectively collected the demographic and clinical data of the first 38 fatalities whose information was made public by the health commission of hubei province and the official weibo account of china central television news center, starting from 9 january through 24 january 2020. the death cases were described from four aspects (gender and age characteristics, underlying diseases, the time course of death, symptoms at the incipience of illness and hospital admission). results: among the 38 fatalities, 71.05% were male, and 28.95% were female, with the median age of 70 years (interquartile range (iqr) = 65-81). persons aged 66-75 made up the largest share. twenty-five cases had a history of chronic diseases. the median time between the first symptoms and death was 12.50 days (iqr = 10.00-16.25), while the median time between the admission and death was 8.50 (iqr = 5.00-12.00) days. in persons aged over 56 years, the time between the first symptoms and death decreased with age, and so did the time between the admission and death, though the latter increased again in persons aged over 85 years. the major first symptoms included fever (52.63%), cough (31.58%), dyspnea (23.68%), myalgia and fatigue (15.79%). conclusions: among the death cases, persons with underlying diseases and aged over 65 made up the majority. the time between the first symptoms and death decreased with the advanced age. in all the age groups, males dominated the fatalities. viewpoints research theme 6: covid-19 pandemic rocketing up, the death toll of the coronavirus disease 2019 (covid19) outbreak has overtaken that of the severe acute respiratory syndrome (sars) during the 2002-2003 epidemic. the covid-19 outbreak has wreaked havoc on all sectors in china, resulting in city lockdown, traffic restrictions, work shutdown, and school cancellation, etc., first in wuhan, then later in many other cities. 
many countries have imposed travel restriction, suspended flights, and barred entry of chinese nationals. the sudden shock of the covid-19 has had a significant impact on the chinese economy [2] . containment of virus transmission has become a top priority to global public health security. researchers have been racing against time since the outbreak of the covid-19, as little was known regarding covid-19 virus initially. two viral genome studies had indicated that the novel virus is closely related to sars-cov (one research revealing 79.5% and the other 89.1% nucleotide similarity, respectively) [3, 4] , which is reminiscent of the calamitous sars outbreak 17 years back. aside from viral genome studies, researchers also looked into the clinical features and epidemiologic characteristics of covid-19 cases. clinical manifestation of covid-19 ranges from mild symptoms (low-grade fever, fatigue, sore throat, etc.) that resemble a common cold [5] , to severe and even fatal respiratory diseases such as acute respiratory distress syndrome [6] . a study that collected more than 70 000 cases across china also reported asymptomatic cases of covid-19 virus infection, accounting for 1.2% of total confirmed cases [7] . after emergence, the virus spread rapidly through human-to-human transmission [8] , which was substantiated by a modeling study from los alamos national laboratory indicating the median basic reproductive number (r0) for covid-19 virus was 5.7 (95% confidence interval (ci) = 3.8-8.9) [9] . the mechanism behind the high infectivity of the covid-19 virus could be explained by a study revealing that covid-19 virus spike glycoprotein had around 10 to 20-fold higher affinity with angiotensin converting enzyme ii (ace2) receptor that is widely distributed in human organs, than sars-cov spike glycoprotein [10] . in addition, the covid-19 virus spreads mainly from person-to-person contacts via respiratory droplets or contact with infected surfaces or objects [11] . 
with a combination of high infectivity and easy transmission, the covid-19 virus poses a great threat to anyone who has close contact with an infected person, especially within families and among frontline medical staff [12]. to make things worse, transmission from asymptomatic patients was confirmed by a case report of a german businessman who was infected by his asymptomatic chinese business partner from shanghai [13]. who issued a warning against possible transmission of covid-19 virus from infected people before they developed symptoms. these findings have raised concerns across the globe, sounded the alarm of a dire situation, and prompted authorities to ramp up quarantine measures. how lethal covid-19 is remains unclear, as literature on the case fatality rate is scarce. based on the data compiled by who, the overall case fatality rate of covid-19 globally was initially estimated at around 2% [14], similar to the overall case fatality rate of 2.3% from a study which collected more than 70 000 cases in mainland china as of 11 february 2020 [7]. however, these two case fatality rates were much lower than the result of the study by wang et al. on a case series of 138 consecutive hospitalized covid-19 patients (mortality: 4.3%) in a hospital in wuhan, china [15]. the higher case fatality rate in the study of wang et al. can be attributed to the large scale of the infection in the epicenter of the outbreak (more than 40 000 cases in early february 2020), the heavily strained medical system, and the lack of protective suits and medical supplies (such as masks, goggles, gloves, and disinfectants). given that the death toll of covid-19 topped that of the sars outbreak during 2002-2003 in less than two months, covid-19 virus infection will deal a more substantial blow to the globe, and may provoke more fear, than sars.
a look into the death cases may provide more information to the public and soothe panic, as studies have suggested that misinformation and inadequate information contribute to unnecessary public panic and subsequent undesirable responses [16, 17]. as there is a paucity of evidence on which population is more likely to progress into severe conditions among covid-19 cases, here, we pored over the first batch of 38 death cases whose information was made public by the health commission of hubei province as of 24 january 2020, one day into the city lockdown in wuhan, with the purpose of adding a new facet to current evidence. data on covid-19 death cases in hubei were extracted prospectively from the website of the health commission of hubei province [18] and the official weibo (china's equivalent of twitter) account of china central television news center [19], starting from 9 january 2020, when the first deceased patient was reported, through 24 january 2020, when the 38th was registered. since 25 january 2020 the number of death cases has been surging, and the health commission of hubei province has stopped making public the information of death cases. therefore, data collection was terminated at that point. microsoft excel 2016 (microsoft, redmond, wa, usa) and spss 23.0 software (ibm corp., armonk, ny, usa) were used for data analysis. the death cases were described from four aspects (gender and age characteristics, underlying diseases, death time distribution, and symptoms at the incipience of illness and hospital admission). frequencies (%) and medians (interquartile ranges [iqr]) were used to describe the data. as of 24 january 2020, the overall case fatality rate for covid-19 was 5.3% in hubei. among the fatalities, there were 27 males and 11 females, with a male to female ratio of 2.45:1. the youngest patient was 36 years old and the oldest was 89 years old, with the median age being 70 years (iqr = 65-81).
the median age for females and males both stood at 70 years, though the iqr ranged from 66 to 80 for the former and from 65 to 81 for the latter. the distribution of the 38 fatalities by gender and age group is shown in figure 1. there were 14 cases aged 66-75 years, making up the largest share of 36.84%. next came 10 cases aged 76-85 years, accounting for 26.32%. the same pattern was found within each gender, with 66-75 years forming the largest share (33.33% of males vs 45.45% of females) and 76-85 years the second largest (22.22% of males vs 36.36% of females). among the death cases, 25 had underlying diseases, including 16 males and nine females, accounting for 65.79% of the total. there were 15 cases of hypertension, 11 cases of diabetes, four cases of coronary heart disease, three cases of chronic bronchitis, two cases of cerebral infarction, and two cases of parkinson disease. other diseases included chronic obstructive pulmonary disease, tuberculosis, frequent ventricular premature beats, colon cancer, gallstone, cirrhosis, chronic renal insufficiency, fracture, hip replacement, etc., each with a tally of one case (table 1). among all the death cases, 17 had one or two underlying diseases, and eight had more than three underlying diseases. among the 38 fatalities, the first case died on 9 january, and the last in the batch died on 24 january 2020, a period stretching 15 days, during which the death toll did not show apparent regularity. to understand the time course of death, we defined the first symptom day as the date on which the patients started to feel their symptoms. we described the period between the first symptom day and the death date as days from the first symptom to death and the period between the date of admission and death as days from admission to death. the median time from the first symptom to death was 12.50 days (iqr = 10.00-16.25).
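the shares reported above are simple proportions of the 38 fatalities, so they can be re-derived from the counts given in the text. the following is a minimal sketch of that arithmetic, not the authors' excel or spss workflow; the function name `share` and the dictionary names are illustrative assumptions, while all counts are taken from the text.

```python
# Counts below are copied from the text; all names are hypothetical.
TOTAL_DEATHS = 38


def share(count, total=TOTAL_DEATHS):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


sex_counts = {"male": 27, "female": 11}
age_group_counts = {"66-75": 14, "76-85": 10}
comorbidity_counts = {
    "any underlying disease": 25,
    "hypertension": 15,
    "diabetes": 11,
    "coronary heart disease": 4,
    "chronic bronchitis": 3,
    "cerebral infarction": 2,
    "parkinson disease": 2,
}

sex_shares = {name: share(n) for name, n in sex_counts.items()}
age_group_shares = {name: share(n) for name, n in age_group_counts.items()}
comorbidity_shares = {name: share(n) for name, n in comorbidity_counts.items()}

# Male-to-female ratio among the fatalities.
male_to_female = round(sex_counts["male"] / sex_counts["female"], 2)

if __name__ == "__main__":
    for label, table in [
        ("sex", sex_shares),
        ("age group", age_group_shares),
        ("underlying disease", comorbidity_shares),
    ]:
        for name, pct in table.items():
            print(f"{label:>18} | {name:<24} {pct:6.2f}%")
    print(f"male-to-female ratio: {male_to_female}:1")
```

because a patient can have several underlying diseases at once, the comorbidity shares are not expected to sum to 100%.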
among the male fatalities, the median time from the first symptom to death was 13.00 days (iqr = 11.00-17.00), and among the female fatalities it was 11.00 days (iqr = 9.00-14.00). the median time from admission to death was 8.50 days (iqr = 5.00-12.00). among the male fatalities, the median time from admission to death was 9.00 days (iqr = 5.00-13.00), and among the female fatalities it was 7.00 days (iqr = 5.00-11.00). days from the first symptom to death tailed off over the age groups of 56-65, 66-75, 76-85, and >85, while the days from admission to death had a similar pattern over the age groups of 56-65, 66-75, and 76-85, but rebounded in persons aged over 85 (figure 2). 66% of the cases died within nine to 15 days after they felt the first symptoms. fever and cough were the main reported symptoms at the onset of illness among the 38 early death cases. twenty patients first complained of a fever, 12 of coughs, 9 of dyspnea, 6 of chest tightness, and 6 of myalgia and fatigue, accounting for 52.63%, 31.58%, 23.68%, 15.79%, and 15.79%, respectively. other symptoms included headache, dizziness, chills, and intermittent diarrhea, each with a tally of one case. fever and dyspnea were the main reported symptoms at hospital admission among the 38 early death cases. twenty-five patients complained of a fever, 23 of dyspnea, 16 of coughs, 10 of chest tightness, and 6 of myalgia and fatigue, accounting for 65.79%, 60.53%, 42.11%, 26.32%, and 15.79%, respectively. other symptoms are shown in table 2. as of 24 january 2020, the initial overall case fatality rate in hubei province reached 5.3%. later on, newly reported cases in china saw a sharp rise, but the overall case fatality rate dwindled. as of 11 february, the overall fatality rate in hubei province was 2.9% [7], which was far lower than the result of our study.
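the symptom counts reported above for illness onset and for hospital admission can be compared side by side. the sketch below ranks the symptoms at each time point and computes the change in percentage points between them; it is an illustrative calculation under the assumption that every percentage uses all 38 fatalities as the denominator (which matches the figures in the text), and the names `ranked_shares` and `change_in_points` are hypothetical.

```python
# Symptom counts are copied from the text; function names are hypothetical.
TOTAL_DEATHS = 38

onset = {
    "fever": 20,
    "cough": 12,
    "dyspnea": 9,
    "chest tightness": 6,
    "myalgia and fatigue": 6,
}
admission = {
    "fever": 25,
    "dyspnea": 23,
    "cough": 16,
    "chest tightness": 10,
    "myalgia and fatigue": 6,
}


def ranked_shares(counts, total=TOTAL_DEATHS):
    """Return (symptom, percentage) pairs sorted from most to least frequent."""
    rows = [(name, round(100 * n / total, 2)) for name, n in counts.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


def change_in_points(before, after, total=TOTAL_DEATHS):
    """Return the change in percentage points for each symptom in both tables."""
    return {
        name: round(100 * (after[name] - before[name]) / total, 2)
        for name in before
        if name in after
    }


if __name__ == "__main__":
    print("at onset:    ", ranked_shares(onset))
    print("at admission:", ranked_shares(admission))
    print("change (percentage points):", change_in_points(onset, admission))
```

under these counts, dyspnea shows the largest increase between onset and admission, which is the shift the text describes when it names fever and dyspnea as the main symptoms at admission.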
the later decline in the overall case fatality rate was attributed in part to the effective treatment of covid-19, as thousands of medical workers from other parts of china poured into hubei province to aid their fellow workers battling the coronavirus. in addition, there was a substantial shortage of test kits at the early stage of the covid-19 outbreak, making it challenging to identify infected cases [20]. afterward, test kits were supplied in large quantities, and the number of confirmed patients grew significantly. besides, with a continuous flow of medical resources and personnel into the epicenter and the sweeping screening of infected persons in the communities, infected persons were identified and admitted to hospitals (including fangcang shelter hospitals) speedily, reducing the likelihood of progression to severe disease and preventing the wide spread of the coronavirus in communities. we discuss the epidemiological characteristics of the 38 cases in the early stage of the outbreak under the following four headings. gender and age characteristics: 71.05% of the deaths were male, considerably more than female, which is consistent with the findings of wang w et al. [21]. single-cell sequencing of covid-19 virus receptors at tongji university found that asian men were more likely to be infected with covid-19 virus [22], and a study of 8866 cases nationwide also found that the death rate of men was more than three times that of women [23]. the predominance of males among the fatalities could be explained by the higher ace2 level in men than in women [23], rendering men more susceptible to covid-19 virus. in addition, people infected with covid-19 virus tend to be older [24].
in a recent lancet article (15 february 2020) [6], 53% of the confirmed cases had chronic underlying diseases, and the median age was 55.5 years, indicating that middle-aged and older patients with chronic underlying diseases were more likely to contract the covid-19 virus. from experience, we can see that patients with chronic underlying diseases are indeed more likely to deteriorate or even die. among the death cases, persons with underlying diseases and aged over 65 made up the majority. hence, we speculate that covid-19 could worsen in elderly persons with underlying diseases and progress more easily to death. this is mainly due to the dwindling immunity in the elderly, especially in those with underlying diseases, which directly renders senior people more likely to be in a state of frailty and more vulnerable to infections [25], and subsequently leads to worsening of the disease [26]. among the death cases, persons with hypertension and/or diabetes made up the largest share, which could be explained by the fact that hypertension and diabetes top the chronic disease chart in china [27]. ace2 is a crucial regulator of the renin-angiotensin system, plays a regulatory role in the central regulation of blood pressure and cardiovascular function, and could become an attractive target for the treatment of hypertension [28, 29]. covid-19 virus uses the receptor ace2 to enter target cells [30], precisely as sars-cov does. turner et al. [31] found that sars-cov infection affects the function of ace2, so we speculate that the covid-19 virus will also impair the function of ace2, thereby disturbing the regulation of blood pressure and having a negative impact on patients with hypertension. on the other hand, hypertension can cause vascular damage. in patients with hypertension, increased vascular stiffness and decreased elasticity are common, followed by vascular remodeling and stenosis [32].
the pathological results of patients with covid-19 showed pulmonary vessel endothelial swelling, luminal stenosis, and occlusion, leading to acute lung dysfunction [31]. the coexistence of hypertension and covid-19 is a very unfavorable factor that can induce lung dysfunction, which is prone to aggravate the condition and even result in death. the ace2 gene can also be expressed in the pancreatic islets. a study showed that the binding of sars-cov to ace2 damages islets and causes acute diabetes [33]. covid-19 virus may also exert such a negative effect on islets through the same mechanism. in persons with preexisting diabetes, the damage to islets by covid-19 virus could be more severe, and even fatal [4, 34, 35]. in addition, ace2 is abundant in the lungs, heart, kidney, intestine, and testicles, etc. once covid-19 virus gains entry into the human body, more organs could be attacked by the virus through blood circulation over time [36]. therefore, early diagnosis of covid-19 before it progresses into severe conditions is an important measure for older people who have developed a fever and respiratory symptoms [24]. other measures, including reducing the chances of exposure to infected cases (eg, banning visits to nursing home residents, avoiding gatherings) and early isolation and treatment of symptomatic confirmed cases, can be beneficial to the elderly population, especially those with preexisting underlying diseases. in this study, the number of deaths did not show obvious regularity over time within the two weeks. in addition, the covid-19 virus infection rate has been spiking since 20 january. according to the data of the national health commission [37] and the results from our study, we make the following speculation: the cases gradually became infected around the end of december according to a median 7-day incubation period and a median 12.5-day period from the first symptom to death [20].
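the back-calculation behind this speculation can be sketched as simple date arithmetic: subtract the median incubation period and the median time from first symptom to death from a date of death. the snippet below is an illustration only. applying two medians to an individual death date is a crude assumption, and the function name `estimated_infection_date` and its parameters are hypothetical; the 7-day and 12.5-day medians and the 9 and 24 january death dates come from the text.

```python
from datetime import date, datetime, time, timedelta


def estimated_infection_date(
    death_date,
    incubation_days=7.0,         # median incubation period cited in the text
    symptom_to_death_days=12.5,  # median time from first symptom to death
):
    """Back-calculate an approximate infection date from a date of death.

    The death is placed at 00:00 on death_date so that fractional days
    (for example 12.5) are handled explicitly rather than silently dropped.
    """
    death_at_midnight = datetime.combine(death_date, time.min)
    lag = timedelta(days=incubation_days + symptom_to_death_days)
    return (death_at_midnight - lag).date()


if __name__ == "__main__":
    first_death = date(2020, 1, 9)   # first of the 38 fatalities
    last_death = date(2020, 1, 24)   # last of the 38 fatalities
    print("earliest estimate:", estimated_infection_date(first_death))
    print("latest estimate:  ", estimated_infection_date(last_death))
```

with the default medians, the first death on 9 january maps back to an infection around 20 december 2019, in line with the end-of-december estimate. a different incubation assumption can be explored by passing another value for `incubation_days`.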
if we take the later reported maximum incubation period of 38 days into account [38], a considerable part of them may have been infected in november. besides, the difference in immune resistance between genders and ages is also an important reason for the irregular distribution of the time of death [39]. however, figure 2 shows that with increasing age, the days from hospital admission to death and the days from the first symptom to death did gradually decrease, which indicates that covid-19 poses a great threat to elderly patients [40]. our study also shows that the days from hospital admission to death rebounded in persons aged over 85 years. since there were only 4 cases aged over 85 years, the finding needs further validation from long-term, large-scale cohort studies. studies have indicated that viral infection in the early stage mainly presents as upper respiratory tract infection, manifested as fever, headache, and cough [41]. huang et al. reported that 98% of the patients with covid-19 experienced fever, 76% had a cough, and 55% had dyspnea as the first symptom [42]. however, among the deaths up to 24 january, 52.63% had a fever, 31.58% had a cough, and 23.68% had dyspnea as the first symptom. it can be seen that not all infected cases have a high body temperature as the first symptom, and the temperature change in old people is not very pronounced compared with young ones even when they have infectious diseases [43]. among the severely infected elderly, 20%-30% have no fever or a slow fever response, which is often a sign of poor prognosis [44]; this hinders early detection of infection and brings more potential risks to the elderly. therefore, early, repeated examination is a sound response [45]. however, symptoms had changed by the time of admission, with 65.79% of patients showing fever, 60.53% dyspnea, and 42.11% cough, indicating that dyspnea had become the second major symptom.
in covid-19 virus infection, severe patients will have chest discomfort, progressive dyspnea, or acute respiratory distress syndrome symptoms [46], which indicate aggravation of the disease. as a result, the proportion of dyspnea was higher than that of most other symptoms among our death cases. also, although limb myalgia and fatigue, headache, and other initial symptoms occurred in only a small number of cases, this does not mean that covid-19 cases presenting with these symptoms are in mild condition. there is still the possibility of progression to death, which should arouse the vigilance of medical staff [47]. the emergence of a new infectious disease poses a particular challenge to epidemiologic research, as identifying the characteristics of the disease and establishing infection prevention and control for an epidemic is a step-by-step process. during the period from 17 to 20 january 2020, the number of confirmed cases of covid-19 increased 10-fold [48], indicating high infectivity of the novel coronavirus [49]. such a disease needs to be contained, or at least its spread needs to be reined in in time. otherwise, the medical system will face enormous pressure, and a large number of infected patients will inevitably die due to the lack of timely treatment. during the covid-19 outbreak, it is necessary to strengthen the training of medical personnel at all levels of medical institutions, especially those serving at hospitals designated as treatment centers for the disease. at the same time, it is necessary to invest substantial resources in outpatient and emergency departments to detect patients, improve treatment conditions, and increase the capacity to house severe cases. since the elderly and people with underlying diseases are most vulnerable to the coronavirus and often suffer serious consequences [50], it is urgent to ramp up protection and prevention measures for the elderly, especially those with chronic underlying diseases.
it also warns us that in the face of an unknown disease, the protection of vulnerable people is essential. this study has some limitations. first, the data of this study came from the panel data of the official website of the health commission of hubei province, so the clinical information of the cases collected is limited. second, as our study focused on the deaths in the early stage of the outbreak, and the fatalities constitute only a tiny fraction of the overall still-hiking death toll, the specific relationship between male and female, and the variations in the time window from onset to death, and from admission to death among different age groups needs more large-scale studies . the 2019-ncov outbreak joint field epidemiology investigation team, li q. notes from the field: an outbreak of ncip (2019-ncov) infection in china -wuhan economic turning point to appear in march despite coronavirus a pneumonia outbreak associated with a new coronavirus of probable bat origin a new coronavirus associated with human respiratory disease in china coronavirus disease (covid-19): symptoms and treatment. clinical manifestation of covid-19 epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in wuhan, china: a descriptive study chinese center for disease control and prevention. the epidemiologic characteristics of an outbreak of 2019 novel coronavirus diseases (covid-19) in china clinical characteristics of coronavirus disease 2019 in china high contagiousness and rapid spread of severe acute respiratory syndrome coronavirus 2. emerg infect dis. 2020. epub ahead of print cryo-em structure of the 2019-ncov spike in the prefusion conformation diagnosis and management of covid-19 (7th version-revised importation of rare but life-threatening and highly contagious diseases. 
current situation and outlook transmission of 2019-ncov infection from an asymptomatic contact in germany novel coronavirus (2019-ncov) situation report clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in wuhan, china. jama. 2020. epub ahead of print developing pandemic communication strategies: preparation without panic what have we learned about communication inequalities during the h1n1 pandemic: a systematic review of the literature briefing on covid-19 by hubei health commission the official microblog of cctv news center early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia updated understanding of the outbreak of 2019 novel coronavirus (2019-ncov) in wuhan, china single-cell rna expression profiling of ace2, the putative receptor of wuhan 2019-ncov. biorxiv. 2020. epub ahead of print epidemiological and clinical features of the 2019 novel coronavirus outbreak in china. mdrxiv. 2020. epub ahead of print early epidemiological analysis of the coronavirus disease 2019 outbreak based on crowdsourced data: a population-level observational study. lancet. 2020. epub ahead of print the burden of frailty among u.s. veterans and its association with mortality evaluation of mhla-dr expression rate on immune function and prognosis in elderly patients with pneumonia study of the prevalence and disease burden of chronic disease in the elderly in china study on the ace2 pathway in the mechanism of exercise reduction in hypertension mechanisms of mas1 receptor-mediated signaling in the vascular endothelium structure of dimeric full-length human ace2 in complex with b0at1. biorxiv. 2020. 
epub ahead of print vascular fibrosis in aging and hypertension: molecular mechanisms and clinical implications clinical pathology of critical patient with novel coronavirus pneumonia (covid-19) ace2: from vasopeptidase to sars virus receptor binding of sars coronavirus to its receptor damages islets and causes acute diabetes management suggestions for patients with diabetes and novel coronavirus pneumonia the single-cell rna-seq data analysis on the receptor ace2 expression reveals the potential risk of different human organs vulnerable to wuhan 2019-ncov infection. front med. 2020. epub ahead of print china news; national wei jian committee. novel coronavirus pneumonia incubation period averages about 7 days be vigilant! 38 days! asymptomatic! enshi diagnosed cases of extra long incubation period expert group on novel coronavirus pneumonia prevention and control of china preventive medicine association. the latest understanding of novel coronavirus pneumonia epidemiology a familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster coronavirus infections and immune responses clinical features of patients infected with 2019 novel coronavirus in wuhan watch out for fever-free older adults most of the old people have no fever clinical course and outcomes of critically ill patients with sars-cov-2 pneumonia in wuhan, china: a single-centered, retrospective, observational study single-cell analysis of ace2 expression in human kidneys and bladders reveals a potential route of 2019-ncov infection. biorxiv. 2020.
epub ahead of print novel coronavirus outbreak in wuhan, china, 2020: intense surveillance is vital for preventing sustained transmission in new locations preliminary prediction of the basic reproduction number of the wuhan novel coronavirus 2019-ncov coronaviruses: an overview of their replication and pathogenesis key: cord-018486-lamfknpt authors: cina, stephen j.; trelka, darin title: sports-related injuries and deaths date: 2014-02-10 journal: forensic pathology of infancy and childhood doi: 10.1007/978-1-61779-403-2_29 sha: doc_id: 18486 cord_uid: lamfknpt physical activity in children and adolescents should be strongly encouraged. while there is a very low risk of death associated with participation in athletics within this age group, the epidemic of childhood obesity and sedentary lifestyle must be combated to ensure the long-term health and quality of life of today’s youth. sports-related deaths due to trauma are usually readily identified; others require careful examination, adjunctive testing, and/or the expertise of consultants. a thorough investigation of circumstances surrounding the death, review of the medical records, and autopsy is mandated in these cases. the loss of a child is always tragic. when death comes to a young person in peak physical condition engaged in athletics, the fatality strikes a blow to an entire community and often attracts the attention of the national media. in some cases, screening studies or training modifications could have prevented the end result. in many others, however, these sudden, unexpected deaths are the result of conditions that cannot reasonably be anticipated or avoided. the benefits of physical activity in young people are incontrovertible. in fact, a presidential initiative has focused on the necessity for solid nutrition and exercise among american youth. physical activity is essential to well-being, and it must be encouraged in children. 
childhood obesity has dire health consequences and creates a pattern that can result in significant morbidity and mortality in later life (see ▶ chap. 33, "childhood obesity"). that being said, engagement in athletics can result in serious injury or death, albeit very infrequently. deaths secondary to trauma are fairly self-explanatory, so only a brief overview is in order. much of this chapter will focus on natural disease processes and pathologic conditions that can present as sudden death while a child or adolescent is involved in physical activity. a sports-related fatality is one in which death occurs while the participant is engaged in athletics. within this broad category, death can be directly attributed to an injury received during the activity, in which case the manner of death is best certified as an "accident" provided that the trauma was received in accordance with the rules of the sport. in the case of death due to trauma inflicted flagrantly outside of the rules of the sport, or if the lethal injury was intentionally inflicted, a manner certification of "homicide" may be more appropriate. in many cases, sudden death may be the result of physical stress superimposed upon a natural disease process or pathologic condition, often involving the heart. in this setting, the manner of death should be certified "natural," analogous to myocardial infarction occurring in an older person engaged in physical exertion (froede 2003). the sequelae of repetitive blows to the head have attracted much attention in recent years (omalu et al. 2005, 2010). while the manifestations of repeated concussions usually appear in middle age or later, it is probable that the damage begins when the brain is first jarred with the resultant alteration in mental status that is characteristic of a concussion.
what is certain is that concussions must be treated as a serious medical condition, and vigilance is necessary to ensure the safety of participants who are in a post-concussive state. the clinical entity known as the "second impact syndrome" (sis) can cause morbidity and mortality following head injury (bey and ostick 2009) . sis is comprised of two events: (1) a concussive head injury and (2) a second head injury within several weeks followed by cerebral edema, herniation, and death. although the incidence is arguable and is yet to be firmly established, it is thought to be a rare outcome of head injury (bey and ostick 2009) . in any event, any athlete who manifests concussive symptoms following a head injury (e.g., fatigue, confusion, headache, nausea, vomiting) should be closely observed and not be permitted to return to play for 7-14 days (bey and ostick 2009) (table 26 .1). most serious head injuries occur in the traditional contact sports. they run a spectrum that includes superficial lacerations, contusions, and abrasions to skull fractures, cerebral contusions, deep axonal injury (in the delicate white matter of the brain), and intracranial bleeding. when a human head collides with another object, skull fractures may occur. protective gear, such as helmets, minimizes this risk in many sports. in an unprotected head, skull fractures may be associated with tearing of arteries within the bones of the calvarium, including the middle meningeal artery, resulting in epidural hematomas (edh) that may evolve rapidly and compress the underlying brain. cerebral contusions may also occur at the fracture site. these types of injuries are surgical emergencies. with a more significant direct impact to the head, an open fracture may occur with resultant direct injury and extrusion of the brain. impacts to the face can result in fractures with resultant compromise of the upper airways. deceleration injuries to the head can also be devastating. 
when a moving or falling head makes contact with a firm surface, injuries to the brain and intracranial bleeding may occur. in contradistinction to the cerebral contusions and/or fractures directly subjacent to the site of impact of a moving object with a stationary head, deceleration injuries may be associated with contrecoup cerebral contusions. these lesions are located opposite to the site of impact of a moving head with a stationary surface. common locations for contrecoup cerebral contusions are the inferior aspects of the frontal lobes and the anterior temporal lobes. contrecoup contusions may or may not be accompanied by basilar skull fractures. rapid deceleration of the cranial contents can also result in diffuse axonal injury (dai) in the white matter and intracranial hemorrhage. subdural hematomas (sdh) secondary to venous bleeding following rapid deceleration of the head can result in a potentially lethal increase in intracranial pressure that must be aggressively managed. with a whiplash type of motion or significant hyperextension of the neck, severe injuries to the cervical spine and underlying spinal cord can occur (watanabe et al. 2010), resulting in paralysis, respiratory arrest, hemodynamic instability, or death. this type of injury can be seen in violent collisions between bodies, after being ejected from a moving vehicle or animal, or upon impact with the ground while the body is tumbling or rolling. violent impacts to the face, as seen in boxing, can also cause the head to snap back or rotate rapidly, with laceration or dissection of the vertebral arteries and subsequent subarachnoid hemorrhage (nedeltchev and baumgartner 2005). at autopsy, the pathologist should remove the brain, cerebellum, pons, and medulla and may choose to preserve the block in formalin to permit careful sectioning after 2 weeks.
if cervical injuries are anticipated, the pathologist should employ anterior and posterior neck dissections and/or vertebral artery dissection for accurate evaluation. injuries to the ribs and internal organs with subsequent internal bleeding can occur with significant impact to the chest, back, or abdomen. while rib fractures can be debilitating and painful, they are not usually lethal unless there are associated vascular or visceral lacerations or collapse of the lung and pneumothorax. internal bleeding is most often associated with lacerations of the spleen or liver following an impact. the bleeding may occur over a matter of hours or days, so a careful history may be required to establish that the injury occurred during participation in a sport. the spleen is particularly prone to injury if enlarged due to infectious mononucleosis, so infection with epstein-barr virus should be considered if rupture occurs after relatively trivial impact. direct impacts to the abdomen can also injure the mesentery, pancreas, or gastrointestinal tract. the kidneys, being relatively protected by their retroperitoneal position and the presence of a thick fat pad, are injured less frequently. a well-established cause of death in athletes receiving a precordial impact is commotio cordis (westrol et al. 2010; geddes and roeder 2005) . death is the result of a lethal dysrhythmia related to a blunt force impact to the chest occurring at a vulnerable phase in the cardiac cycle. classically, the athlete is struck in the midchest by a projectile (e.g., a baseball) and collapses within a matter of seconds. reconstruction of the events leading up to death is required to establish this diagnosis as there may be minimal or no anatomic signs to establish chest trauma and the mechanism of death is transient disruption of impulses within the cardiac conduction system. injuries to the pelvis are unusual in contact sports. 
however, a significant impact to the perineum, a situation that may occur during riding events (e.g., riding cycles, motorbikes, horses), may cause pelvic fractures, injuries to the genitourinary tract, and internal bleeding. impacts to the scrotum and penis can also result in significant pain and morbidity, but they are rarely life threatening. fractures, sprains, strains, and dislocations are commonly encountered in sports, but these injuries are also rarely life threatening. while in a prolonged debilitated state during rehabilitation, however, thrombi may develop in the deep veins of the lower extremities which may embolize to the lungs resulting in sudden death. if a preexisting coagulation disorder is present, the risk for deep venous thrombosis is increased. smokers and female athletes on birth control pills may be at greater risk for this complication. fractures can lead to embolization of marrow elements, predominantly fat, throughout the body (fig. 26.1 ). in addition to the problems associated with physical obstruction of vessels by large emboli, disseminated intravascular coagulation (dic) and activation of chemical mediators can result in death (hofmann et al. 1995) . the fat embolism syndrome typically occurs 24-72 h following a long bone fracture or crush injury presenting as adult respiratory distress syndrome (ards). deep trauma to adipose tissues can also cause fat embolization. of note, fat and marrow emboli are often the result of cardiopulmonary resuscitation with associated rib and/ or sternal fractures, so clinical correlation is required. soft tissue injuries have also been associated with the development of "flesh eating" (group a streptococcus) bacterial infections, even with no breach of the integument (chang et al. 2009 ). 
it should be remembered that even if death is due to a natural disease process, such as sepsis or pneumonia, the manner of death should be certified as "accident" if trauma initiated the chain of events that culminated in death. sport participants are often under intense pressure to perform at a high level, and they may be encouraged to maximize performance through the use of supplements and dietary modification. this is true not only of professionals but also of amateurs and individuals as young as preadolescents. when investigating the death of an athlete, a complete dietary history, including inquiry into the use of chemicals, vitamins, herbal supplements, and performance-enhancing drugs, should be obtained and any such substances procured. anabolic steroids have been historically used to facilitate strength and speed increases, muscle hypertrophy, and decreased recovery time and to generally improve athletic performance (hartgens and kuipers 2004). pathologic changes manifested following anabolic steroid use may be appreciated on external examination and can include testicular atrophy, male pattern alopecia, male gynecomastia, masculinization, decreases in breast size and body fat, clitoral enlargement, acne, and hirsutism in females (us department of health and human services national institute on drug abuse 2001). anabolic steroid use can result in peliosis of the liver, psychiatric instability, cardiomyopathy, and death (hartgens and kuipers 2004).
fig. 26.1 fat and bone marrow elements may embolize from fracture sites to the lungs. this may be a cause of death, an artifact of trauma, or secondary to cardiopulmonary resuscitation (hematoxylin and eosin, h&e ×10)
recently, anabolic steroid use has been implicated in suicides and homicides (so-called "roid rage"). the long-term effects of other performance-enhancing substances, such as human growth hormone (hgh) and creatine, have not been well established, and these substances are not recommended for children and adolescents.
testing for these "performance-enhancing substances" is commonly performed on urine and hair samples through the use of reference laboratories. stimulants and diet aids can also cause or contribute to sports-related deaths. substances containing ephedrine and ephedra alkaloids have been linked to sudden death (haller and benowitz 2000). "energy drinks" often contain agents that can contribute to cardiac deaths and hyperthermia, as can certain antihistamines (clauson et al. 2003; lópez-barbeito et al. 2005). it goes without saying that illicit drugs, including cocaine and methamphetamine, can contribute to or directly cause the death of athletes. participants in sports requiring lean body mass are at risk for death related to dehydration or metabolic abnormalities associated with anorexia nervosa and bulimia (warren 2011; misra and klibanski 2011). it should be stressed that these disorders afflict adolescents who are concerned about their body image as well as those participating in competitive sports. investigative history consistent with these disorders includes drastic weight loss, use of either prescription or over-the-counter medications to facilitate urination and defecation, use of appetite suppressants, vomiting after eating, and incessant exercise. physical manifestations appreciable on external examination may include cachexia; calluses, scars, or abrasions on the hands if fingers are used to induce vomiting; dental caries or loss of tooth enamel from chronic exposure to gastric acid; and periorbital, conjunctival, or scleral petechiae from induced vomiting (department of health and human services office on women's health 2009). vitreous electrolyte analysis may shed light on deaths due to dehydration or self-imposed starvation or malnutrition, but it may not establish the cause of death in all such cases. once again, a careful investigation may be required to establish this risk factor for sudden death (see ▶ chap.
24, "starvation, malnutrition, dehydration, and fatal neglect"). cardiac disease is the leading cause of sudden death in athletes engaged in sports and strenuous activities. until proven otherwise, a cardiovascular source of death should be sought when an athlete unexpectedly collapses and dies. this category of death can be broadly divided into infection, congenital conditions (molecular and structural), coronary artery anomalies, neoplasms, and progressive organic diseases. myocarditis is an inflammatory process involving the heart characterized microscopically by an inflammatory infiltrate in the myocardial interstitium accompanied by myocyte necrosis (fig. 26.2). in the majority of cases, the inflammation is due to a viral infection (e.g., coxsackie virus and adenovirus), and a lymphocytic infiltrate will predominate. clues to the diagnosis include a recent viral illness and a "floppy" heart upon gross examination. although a viral etiology can be demonstrated in some cases through laboratory studies, in other cases the infectious agent will not be isolated. other myocarditides are caused by bacteria, fungi, parasites, or autoimmune processes. depending on the etiology of the process, the inflammatory infiltrate may consist of giant cells, eosinophils, histiocytes, or neutrophils. histologic sections may require special stains (e.g., brown and hopps, silver, or periodic acid-schiff stains) in order to better delineate microorganisms. sarcoidosis, a granulomatous inflammation of the heart, may be the result of a postinfectious inflammatory response or of an autoimmune process, the etiology of which remains unclear. special stains to rule out tuberculosis and fungi should be employed to support this diagnosis. congenital conditions may manifest themselves at a structural, cellular, or molecular level. there is a litany of metabolic diseases that may affect the heart, including pompe disease and other storage disorders.
these are beyond the scope of this chapter and will not be discussed in further detail, other than to say that they may be a cause of sudden death in childhood. many of these diseases are symptomatic early in life (see ▶ chap. 31, "cardiac channelopathies and the molecular autopsy," ▶ chap. 32, "other pediatric cardiac conditions," and ▶ chap. 34, "pediatric metabolic diseases"). at the molecular level, two major considerations are long qt syndrome and brugada syndrome (goldenberg et al. 2008; escárcega et al. 2009 ). these cardiac ion channelopathies may result in sudden, unexpected death in apparently healthy individuals. there is an association with death during swimming with long qt syndrome (choi et al. 2004) , which may be diagnosed by evaluation at reference laboratories if it is suspected and appropriate samples are obtained. these diagnoses can be made by retrospective analysis of electrocardiograms in some cases; however, in many young people, this antemortem study has never been performed. these conditions cannot be diagnosed at the gross or microscopic level as they are rhythm disturbances. hypertrophic cardiomyopathy (formerly asymmetric septal hypertrophy, idiopathic hypertrophic subaortic stenosis) can be diagnosed grossly and microscopically. in classic cases, the interventricular septum will be markedly thickened when compared to the left ventricular free wall. in other cases, the left ventricle may show concentric hypertrophy; the right ventricle may also be thickened. often, fibroelastosis of the endocardium below the aortic valve is seen as a "jet lesion" (fig. 26.3) . microscopically, myocyte disarray with intervening fibrosis is the characteristic histologic finding (fig. 26.4) . this finding may be focal, and multiple microscopic sections of the heart with trichrome staining may assist in the diagnosis. 
this disease is caused by a protein abnormality in the heart resulting from a mutation in the genes encoding for the sarcomeric proteins (e.g., myosin heavy and light chains, myosin-binding protein c, troponins i and t, and tropomyosin) (harris et al. 2011). as hypertrophic cardiomyopathy is an autosomal-dominant inheritable condition in approximately half of the victims, this diagnosis has implications for surviving family members (cirino and ho 2008). marfan syndrome affects multiple sites in the body. the cardiovascular manifestation of this condition is cystic medial necrosis of the aorta. this may result in aortic dissection with rupture into the pleural spaces or pericardial sac with cardiac tamponade or dissection of the coronary arteries. marfan syndrome should be suspected in the sudden collapse and death of tall athletes with long hands and feet (arachnodactyly), a desirable physique for basketball and volleyball players. this disease is caused by a mutation in the fibrillin-1 gene, the product of which is an extracellular matrix glycoprotein that maintains the structural integrity of connective tissues (robinson et al. 2006). arrhythmogenic right ventricular cardiomyopathy (arvc) may be inherited in an autosomal-dominant pattern (azaouagh et al. 2011). it is characterized by progressive replacement of the myocardium of the right ventricle by adipose tissue and fibrosis. occasionally, there are a few scattered inflammatory cells. in advanced cases, the left ventricle may also be involved. this condition may be undiagnosed as the findings are subtle in the early stages, and the right ventricle is often undersampled for histologic analysis. this diagnosis can, at times, be difficult to make, and cardiovascular pathology consultation may prove beneficial.
fig. 26.3 in hypertrophic cardiomyopathy, a fibroelastotic "jet lesion" is often found on the endocardium subjacent to the aortic valve
structural defects resulting in sudden death may or may not be grossly apparent. valvular anomalies, septal defects, and transposition of the great vessels can be readily identified at autopsy. deaths due to structural anomalies of the coronary arteries may be more subtle. consultation with a cardiovascular pathologist may be helpful in identifying coronary arterial atresia, intramyocardial tunneling, or acute origin from the sinus of valsalva. these experts may also assist in identifying problems with the cardiac conduction system. these may be either aberrant neural pathways or stenoses of the arteries supplying the atrioventricular (av) or sinoatrial (sa) nodes. a microscopic tumor of the av node can also result in sudden death (fig. 26.5). structural anomalies of other blood vessels may also lead to sudden death. arteriovenous malformations, particularly within the central nervous system, may rupture with catastrophic results. "berry" aneurysms of the cerebral vasculature may enlarge over time, and intense physical exertion with associated elevation of blood pressure (e.g., weightlifting) may precipitate bleeding. aneurysms and pseudoaneurysms of large arteries can cause massive internal hemorrhage. primary cardiac neoplasms are rare, but they can lead to death. tumors that affect the heart include atrial myxoma, fibroma, and rhabdomyoma, the latter associated with tuberous sclerosis. the heart may also be affected by lymphomas, angiosarcomas, and metastatic disease. the most common cancers metastatic to the heart are lung, breast, melanoma, and leukemia/lymphoma. adolescents are not immune to cardiovascular diseases that kill older individuals. especially in the setting of familial hypercholesterolemia, atherosclerotic coronary artery disease may develop in the mid-teen years. hypertension may also result in myocardial hypertrophy and lethal dysrhythmia; however, this must be distinguished from hypertrophic cardiomyopathy, discussed above. whereas hypertrophic cardiomyopathy commonly affects the septum on gross inspection and is associated with myocyte disarray, hypertension generally results in concentric thickening of the left ventricular chamber and enlarged, hypertrophic myocytes with hyperchromatic "box car" nuclei at the microscopic level. lastly, morbid obesity has been associated with sudden death (see ▶ chap. 33, "childhood obesity").
fig. 26.4 myocyte disarray with intervening fibrosis is characteristic of hypertrophic cardiomyopathy. it may be focal, and multiple heart sections should be examined if there is no apparent cause of death in an athlete following autopsy (hematoxylin and eosin, h&e ×100)
when faced with an unanticipated subarachnoid hemorrhage, the pathologist should remove the brain themselves with frequent photographic documentation of the process in order to capture occult lesions prior to onset of any removal artifact(s). once removed, the brain, cerebellum, pons, and medulla should be copiously rinsed with water to remove adherent blood and clot. in lieu of water, hydrogen peroxide may be used to facilitate the lysis of adherent blood from the delicate vasculature so that it can be better examined. care must be taken to avoid destruction of subtle vascular malformations and aneurysmal sacs. sickle-cell disease may be diagnosed in childhood, and it can afflict participants in athletics and other strenuous activities. if the diagnosis of sickle-cell disease is known, recognition and treatment of an impending crisis can avert death. many people with sickle-cell trait, however, are unaware of their condition. when subjected to intense physical exertion, high temperatures, and a component of dehydration, a crisis may ensue and death may rapidly follow (scheinin and wetli 2009; manci et al. 2003). a recent viral illness could also be an exacerbating factor.
this condition should be considered when an athlete of african or mediterranean descent complains of joint pain, chest pain, fatigue, and weakness prior to collapse. it is often misdiagnosed as a heat-related illness. the diagnosis is made at the microscopic level, wherein virtually all organs will be congested by sickled erythrocytes (figs. 26.6 and 26.7). at the gross level, persons with sickle-cell disease may have fibrotic, atrophic spleens, whereas those with sickle-cell trait may have enlarged spleens, congested with sickled erythrocytes. the diagnosis can be confirmed with hemoglobin electrophoresis (blood best procured in a tube with anticoagulant/edta) and correlated with information obtained regarding prevalence and distribution of this disease within the family. asthma is a common disease among children and teens, and acute attacks may be precipitated by physical activity. in order to certify death due to asthma, the circumstances of death need to reflect a respiratory crisis. in many cases, an inhaler and/or a nebulizer will be found near the victim or with their personal belongings. grossly, the lungs will be hyperinflated, often touching over the heart in the midline, with prominent mucus plugging of the airways. the microscopic findings of chronic asthma in the bronchioles (thickening of the basement membranes, smooth muscle hypertrophy, and mucus gland hyperplasia) will be accompanied by an eosinophil-rich inflammatory infiltrate that extends into the luminal mucus plugs (figs. 26.8 and 26.9). a history of asthma should not be a default cause of death in these cases without the circumstances and scene findings supporting a respiratory catastrophe. epilepsy, like asthma, may be a cause of death; however, the circumstances should support the diagnosis. in the absence of a witnessed seizure, other causes of death must be excluded prior to attributing death to epilepsy. further, intracranial trauma must be excluded as a cause of the seizure.
some epileptics will die suddenly and unexpectedly in the absence of a seizure (sudden unexpected death in an epileptic person or sudep), but this does not typically occur while the victim is engaged in sports. the autopsy may or, more commonly, may not identify the anatomic correlate of the seizure focus in the brain. spontaneous pneumothorax may occur during sports presumably due to increased shear forces at the apex of the lung (abolnik et al. 1993) . it may cause death if it progresses to a tension pneumothorax, a condition that results in both hypoxia and mechanical alterations of the cardiovascular system. this is a diagnosis that may be missed if it is not suspected. diabetes mellitus can kill children and adolescents engaged in sports. activities which entail dietary restrictions may predispose those with the disease to ketoacidosis. young people more often have type i diabetes and may be insulin dependent. as teens have a tendency toward denial and risk-taking behavior, they may not be fully compliant with their treatment regimens and therefore be prone to significant blood glucose fluxes. the gross findings at autopsy will be minimal in these cases. microscopically, the islets of langerhans in the pancreas may be infiltrated by lymphocytes ("insulitis"), or they may be diminished in number. urine screens for glucose and ketones may be useful, but postmortem blood analysis is unreliable. the best sample for diagnosing diabetes mellitus and ketoacidosis postmortem is vitreous humor. the presence of ketones and significantly elevated glucose (> 500 mg/dl) in the vitreous humor is diagnostic of this condition (chansky et al. 2009 ). vitreous glucose levels drop significantly after death, however, so a lower ocular glucose level does not exclude hyperglycemia. further, hypoglycemia cannot be diagnosed postmortem due to the aforementioned postmortem change. 
in the evaluation of a nontraumatic death occurring during sports, analysis of vitreous glucose, ketones, and electrolyte levels is recommended in all cases. prior to the examination, chest radiographs including lateral and seated views may illustrate free pleural air and displacement of the heart. at autopsy, care should be taken to reflect the skin and soft tissues of the chest without breaching the intercostal tissues or entering the chest cavities. a pocket which should be filled with water can be created using the reflecting chest tissues. the intercostal tissues can then be pierced below the water level to examine whether air bubbles emerge.
the findings of chronic asthma (basement membrane thickening, smooth muscle hypertrophy, mucus gland hyperplasia) with an intense eosinophilic inflammatory response in a person who died during an asthma attack during exercise (hematoxylin and eosin, h&e ×20)
second impact syndrome
children with lethal streptococcal fasciitis after a minor contusion injury
hyperglycemic emergencies in athletes
spectrum and frequency of cardiac channel defects in swimming-triggered arrhythmia syndromes
familial hypertrophic cardiomyopathy overview
safety issues associated with commercially available energy drinks
department of health and human services office on women's health (2009) fact sheet on bulimia nervosa
the brugada syndrome
evolution of our knowledge of sudden death due to commotio cordis
long qt syndrome
effects of androgenic-anabolic steroids in athletes
adverse cardiovascular and central nervous system events associated with dietary supplements containing ephedra alkaloids
in the thick of it: hcm-causing mutations in myosin binding proteins of the thick filament
pathophysiology of fat embolisms in orthopedics and traumatology
diphenhydramine overdose and brugada sign
causes of death in sickle cell disease: an autopsy study
bone metabolism in adolescents with anorexia nervosa
traumatic cervical artery dissection
chronic traumatic encephalopathy in a national football league player
chronic traumatic encephalopathy (cte) in a national football league player: case report and emerging medicolegal practice questions
the molecular genetics of marfan syndrome and related disorders
sudden death and sickle cell trait: medicolegal considerations and implications
anabolic steroid abuse. nih publication number
endocrine manifestations of eating disorders
upper cervical spine injuries: age-specific clinical features
causes of sudden cardiac arrest in young athletes
key: cord-268816-nth3o6ot authors: roy, satyaki; ghosh, preetam title: factors affecting covid-19 infected and death rates inform lockdown-related policymaking date: 2020-10-23 journal: plos one doi: 10.1371/journal.pone.0241165 sha: doc_id: 268816 cord_uid: nth3o6ot
background: after claiming nearly five hundred thousand lives globally, the covid-19 pandemic is showing no signs of slowing down. while the uk, usa, brazil and parts of asia are bracing themselves for the second wave, or the extension of the first wave, it is imperative to identify the primary social, economic, environmental, demographic, ethnic, cultural and health factors contributing towards covid-19 infection and mortality numbers to facilitate mitigation and control measures. methods: we process several open-access datasets on us states to create an integrated dataset of potential factors leading to the pandemic spread. we then apply several supervised machine learning approaches to reach a consensus as well as rank the key factors. we carry out regression analysis to pinpoint the key pre-lockdown factors that affect post-lockdown infection and mortality, informing future lockdown-related policy making. findings: population density, testing numbers and airport traffic emerge as the most discriminatory factors, followed by higher age groups (above 40 and specifically 60+).
post-lockdown infected and death rates are highly influenced by their pre-lockdown counterparts, followed by population density and airport traffic. while healthcare index seems uncorrelated with mortality rate, principal component analysis on the key features shows two groups: states (1) forming early epicenters and (2) experiencing a strong second wave or peaking late in rate of infection and death. finally, a small case study on new york city shows that days-to-peak for infection of neighboring boroughs correlates better with inter-zone mobility than with inter-zone distance. interpretation: states forming the early hotspots are regions with high airport or road traffic resulting in human interaction. us states with high population density and testing tend to exhibit consistently high infected and death numbers. mortality rate seems to be driven by individual physiology, preexisting condition, age etc., rather than gender, healthcare facility or ethnic predisposition. finally, policymaking on the timing of lockdowns should primarily consider the pre-lockdown infected numbers along with population density and airport traffic.
[...] during pre- and post-covid periods to show that the odds of mortality of whites and blacks are statistically equivalent [23]. myers et al. analyzed the covid-19 positive patients in california to investigate its prognosis in the higher age groups and individuals with preexisting conditions [24]. zoabi et al. applied ml on 51,831 covid-19 positive patients to understand the effect of gender, age and contact to show that close social interaction is a strong feature for covid-19 transmissibility [25]. khan et al. applied regression tree, cluster analysis and principal component analysis on worldometer infection count data to study the variability and effect of testing in prediction of confirmed cases [26]. finally, pan et al. studied the effects of the myriad public health interventions (such as lockdown, traffic restriction, social distancing, home quarantine, centralized quarantine, etc.) on 32,583 covid-19 patients, with respect to their age, sex, residential location, occupation, and severity [27].
contributions: while it is evident that factors such as gender, race, age, testing, social contact and distancing have been analyzed in a piecemeal manner, there is no comprehensive study that combines the demographic, economic, epidemiological, ethnic and health indicators for infection and mortality from covid-19.
to address this gap, we carry out a machine learning-based analysis with the following three objectives.
1. we curate a dataset of diverse features (detailed in sec. 2.1) from 50 states of the usa. this dataset is somewhat unique since, in addition to the above features, it includes factors such as airport traffic, homelessness and variations in lockdown dates. also, note that the lockdown was enforced on the us states at around the same time, when each state was at a different stage of the covid-19 infection cycle.
2. we analyze the variation of covid-19 infection spread and mortality rates using a set of standard supervised ml methods. we rank the key discriminatory factors based on the importance score calculated from randomized decision trees. we combine the findings to identify the most vulnerable age groups and us states. we also show the effect of testing and lockdowns on the infection spread dynamics.
3. we utilize multiple linear regression to gauge the extent to which the key pre-lockdown factors affect the post-lockdown infected and death numbers. this study assigns weights to features and can drive mitigation efforts and large-scale policymaking.
our data-driven experiments using supervised methods demonstrate that population density, testing [28] and airport traffic [29] are key factors contributing to infection and mortality rates. furthermore, the higher age groups (40 and beyond, and specifically those exceeding 60) are more vulnerable. principal component analysis on the key features shows two groups: highly affected us states (1) forming early epicenters and (2) showing consistent or newly peaking rate of infection and death. multiple regression analysis shows that the post-lockdown numbers are most influenced by the pre-lockdown infected and death numbers followed by population density and airport activity, while overall healthcare index of a state does not seem to play a part in the overall death count.
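as a rough illustration of the multiple linear regression step in the third objective, the sketch below fits a regression by solving the normal equations in plain python. it is not the authors' code: the study reports using scikit-learn, and the function name fit_linear_regression, the feature names and every number here are hypothetical, with the target generated from known weights only so that the fit can be checked.

```python
# Dependency-free sketch of multiple linear regression (normal equations).
# All feature names and values are illustrative assumptions, not study data.

def fit_linear_regression(X, y):
    """Return [intercept, w1, ..., wk] minimising the squared error."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    k = len(rows[0])
    # Augmented normal-equation system (A^T A | A^T y).
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda idx: abs(M[idx][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(k):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][k] for i in range(k)]

# Hypothetical per-state rows:
# (pre-lockdown infected rate, population density, airport traffic), all rescaled.
X = [(0.1, 0.5, 0.2), (0.4, 0.9, 0.7), (0.2, 0.1, 0.3),
     (0.8, 0.6, 0.9), (0.5, 0.3, 0.1), (0.3, 0.8, 0.5)]

# Synthetic post-lockdown target built from known weights (assumed values).
TRUE = [0.05, 2.0, 0.5, 0.3]
y = [TRUE[0] + sum(w * v for w, v in zip(TRUE[1:], row)) for row in X]

coef = fit_linear_regression(X, y)
for name, w in zip(["intercept", "pre_infected", "pop_density", "airport"], coef):
    print(f"{name:>12}: {w:.3f}")
```

with real data, each row would hold one state's pre-lockdown features and y its post-lockdown infected or death number; on comparably scaled features, the size of each fitted weight gives a first indication of that feature's influence.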
similarly, the race of individuals did not play any significant role in the infection or mortality numbers. despite increased testing rates, the fraction of individuals testing positive drops approximately three weeks into the lockdown, suggesting that the social distancing measures have had an impact on curbing spread. finally, we discuss the role of mobility and distance in infection spread. in the absence of large-scale inter-state mobility data, our case study on the boroughs of new york city shows that peaks of infection correlate better with inter-zone mobility than with inter-zone distance. all the experiments have been performed using scikit-learn, a popular machine learning library in python [30].
let us discuss the details of the two datasets used in this work.
2.1.1 data from us states.
our dataset has been carefully curated from several open sources to examine the possible factors that may affect the covid-19 related infection and death numbers in the 50 states of the usa. the individual open-access data sources as well as the integrated (curated) dataset have been shared on github (https://github.com/satunr/covid-19/tree/master/us-covid-dataset). below, we summarize the features and output labels of the integrated dataset.
• gross domestic product (in million us dollars) for us states [31] (filename: source/gdp.xlsx, feature name: gdp).
• distance from one state to another, measured not in miles but as the euclidean distance between the latitude-longitude coordinates of the pair of states [32] (filename: source/data_distance.xlsx, feature name: d(state1, state2)).
• gender features are the fractions of the total population representing male and female individuals [33] (filename: source/data_gender.csv, feature name: male, female).
• ethnicity features are the fractions of the total population representing white, black, hispanic and asian individuals (we leave out other smaller ethnic groups) [34] (filename: source/data_ethnic.csv, feature name: white, black, hispanic and asian).
• healthcare index is measured by the agency for healthcare research and quality (ahrq) on the basis of (1) type of care (like preventive, chronic), (2) setting of care (like nursing homes, hospitals), and (3) clinical areas (like care for patients with cancer, diabetes) [35] (filename: source/data_health.xlsx, feature name: health).
• homeless feature is the number of homeless individuals in a state [36] (filename: source/data_homeless.xlsx, feature name: homeless). the normalized homeless population of each state is the ratio of its homeless population to its total population.
• total cases (and deaths) of covid-19 is the number of individuals who tested positive (and who died) [37] (filename: source/data_covid_total.xlsx, feature name: total cases and total death). the normalized infected/death is the ratio of the infected/death count to the total population of the given state.
• infected score and death score are obtained by rounding the normalized total cases and deaths to discrete values between 0 and 6 (feature name: infected score, death score).
• death-to-infected is a feature measuring the impact of death in terms of the difference between the death and infected scores. it is calculated as max(death score - infected score, 0).
• lockdown type is a feature capturing the type of lockdown (shelter in place: 1 and stay at home: 2) in a given state [37, 38] (filename: source/data_lockdown.csv, feature name: lockdown).
• day of lockdown captures the number of days between 1 january 2020 and the date of imposition of lockdown in a region [39] (filename: source/data_lockdown.csv, feature name: day lockdown).
• population density is the ratio of the population to the area of a region [40] (filename: source/data_population.csv, feature name: population, area, population density).
• traffic/activity of airport measures the passenger traffic (also normalized by the total traffic across all the states of the usa) [41] (filename: source/data_airport.xlsx, feature name: busy airport score, normalized busy airport).
• age groups (0-80+) in brackets of 4 years (also normalized by total population) [40] (filename: source/data_age.xlsx, feature name: age_to_, norm_to_, e.g. age4to8); we later group them in brackets of 20 for the purposes of analysis.
• peak infected (and peak death) measures the duration between the first date of infection and the date of the daily infected (and death) peak [40] (feature name: peak infected, peak death).
• testing measures the number of individuals tested for covid-19 (total number, before and after imposition of lockdown) [38, 42] (filename: source/data_testing.xlsx, feature name: testing, pre-lockdown testing, post-lockdown testing).
• pre- and post-infected and death count measures the number of individuals infected and dead before and after the lockdown dates (feature name: testing, pre-infected count, pre-death count, post-infected count, post-death count).
• days between first infected and lockdown date (feature name: first-inf-lockdown).
the above features, their abbreviations and summary statistics (i.e., mean, standard deviation, maximum and minimum) are listed in table 1. note that for gender and ethnicity we report the fraction of the total state population falling in each category.
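several of the derived features in the list above are simple arithmetic on the raw columns. the following is a minimal sketch of four of them, assuming nothing beyond the definitions given in the list; the function names and the sample values are hypothetical and are not taken from the shared dataset.

```python
from datetime import date
from math import hypot


def normalized(count, population):
    """Per-capita value: a raw count divided by the state's total population."""
    return count / population


def death_to_infected(death_score, infected_score):
    """max(death score - infected score, 0), as defined in the feature list."""
    return max(death_score - infected_score, 0)


def day_of_lockdown(lockdown_date, origin=date(2020, 1, 1)):
    """Number of days elapsed between 1 January 2020 and the lockdown date."""
    return (lockdown_date - origin).days


def latlong_distance(coord_a, coord_b):
    """Euclidean distance between two (latitude, longitude) pairs.

    The result is in degrees, not miles, matching the distance feature above.
    """
    return hypot(coord_a[0] - coord_b[0], coord_a[1] - coord_b[1])


# Hypothetical values, for illustration only.
print(normalized(50, 1000))
print(death_to_infected(5, 3))
print(day_of_lockdown(date(2020, 3, 22)))
print(latlong_distance((0.0, 0.0), (3.0, 4.0)))
```

because the distance is computed directly on latitude-longitude coordinates, it is only a rough proxy for geographic separation; the sketch follows that choice rather than substituting a great-circle formula.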
the new york city (nyc) datasets (https://github.com/satunr/covid-19/blob/master/us-covid-dataset/nyc_dist_mob.xlsx) show the inter-borough distance and mobility as well as covid-19 infected (https://github.com/satunr/covid-19/blob/master/us-covid-dataset/nyc-inf.xlsx) and death counts (https://github.com/satunr/covid-19/blob/master/us-covid-dataset/nyc-dth.xlsx) for the 5 boroughs of nyc, namely, manhattan, queens, brooklyn, bronx and staten island.
table 1. summary of features and their statistics (i.e., mean, standard deviation (dev.), maximum (max.) and minimum (min.)). the features in the order shown under "feature name" are: gdp, inter-state distance based on lat-long coordinates, gender, ethnicity, quality of health care facility, number of homeless people, total infected and death, population density, airport passenger traffic, age group, days for infection and death to peak, number of people tested for covid-19, days elapsed between first reported infection and the imposition of lockdown measures at a given state.
factors affecting covid-19 infected and death rates inform lockdown-related policymaking
• mobility data (based on traffic volume counts collected by dot for the new york metropolitan transportation council (nymtc) [43]) shows the number of trips from one borough to another.
• covid-19 data shows the covid-19 infected and death counts for each borough [44].
we acquire the daily infected and testing counts across the us from january to july 2020 [45]. this dataset is part of the covid tracking project, which collects covid-19 statistics on the numbers of tests, cases, hospitalizations, and patient outcomes from every us state and territory through voluntary public participation.
we use the scikit-learn class kbinsdiscretizer to group the continuous feature values into discrete values by creating balanced bins using the quantile strategy [46].
2.1.5 supervised learning methods.
supervised machine learning algorithms learn a function that maps the input training data (i.e., features) to some output labels [47]. in this work, we consider the following supervised learning techniques (refer to [48] [49] [50] [51] [52] [53] [54] for details on these ml approaches).
• support vector machine (svm) is used for classification and regression problems and maps the inputs to high-dimensional feature spaces. svm operates on hyperplanes, i.e., decision boundaries that help classify the data points. the objective is to maximize the separation between the data points and the hyperplane. svm is memory efficient and effective for datasets with fewer data samples [55].
• stochastic gradient descent (sgd) is an iterative approach that fits the data to an objective function [56]. as the name suggests, it is a stochastic variant of the popular gradient descent (gd) optimization model [57]. in gd, the optimizer starts at a random point in the search space and moves towards a minimum of the function by repeatedly stepping down the slope. unlike gd, which requires calculating the partial derivative for each feature at each data point, sgd achieves computational efficiency by computing derivatives on randomly chosen data points.
• nearest centroid (nc) is a simple classification model that represents each class by the centroid of its members. subsequently, it assigns each data point to the class whose centroid is the closest to it. nc has no model parameters to tune, which makes it a convenient baseline, although it tends to perform poorly on non-convex classes [58].
• decision trees (dts) are a classification and regression technique that assigns target labels based on decision rules inferred from data features [59]. dt maintains the decision rules using a tree. a data point is assigned to a class by starting at the root, comparing its feature value against the rule at the current node, and branching to a child node until a leaf is reached.
• gaussian naive bayes (nb) belongs to a class of fast, probabilistic learning techniques that apply bayes' theorem to assign labels to the data points [60].
while supervised ml approaches generally yield reliable prediction accuracy, they often suffer from overfitting or convergence issues [47, 61]. each of the above approaches has its own advantages and disadvantages. svm works well when the underlying distribution of the data is not known. however, it is prone to overfitting when the number of features is much greater than the number of samples. sgd converges quickly on large datasets, but it may require tuning a number of hyperparameters. conversely, dt involves almost no hyperparameters, but often entails slightly higher training time. unlike dt, nb requires less training time but works on the implicit assumption that all the attributes are mutually independent. finally, nc is a fast method but is not robust to outliers or missing data. in the context of our work, we intuit that the discriminatory feature(s) will yield a high accuracy irrespective of the underlying supervised ml algorithm used.
• accuracy function measures the fraction of matches between the predicted and actual labels in a multi-label classification, i.e., the ratio of correctly predicted observations to the total observations. it can be calculated as $\text{accuracy} = \frac{tp + tn}{tp + tn + fp + fn}$. in the above equation, tp, tn, fp, fn denote true positive, true negative, false positive and false negative, respectively.
• extra trees classifier is an estimator that fits randomized decision trees (called extra-trees) on data samples. the memory and computation overhead of this approach can be controlled by regulating the size of the extra trees. the nodes in the tree are split into sub-trees so as to achieve high accuracy (i.e., a drop in impurity). thus, feature importance is measured as the total reduction in impurity brought about by that feature [62].
• multiple regression (mr) is a statistical tool to capture the linear relationship between the independent and the dependent variables x and y of a function y = g(x). in our context, mr generates a linear relationship $\hat{y} = \beta_0 + \sum_i \beta_{f_i} f_i + \epsilon$, where $\beta_{f_i}$ is the coefficient that captures the contribution of feature $f_i$ towards the dependent variable y, while $\beta_0$ and $\epsilon$ are the intercept and error terms, respectively.
given any pair of vectors $v$ and $\hat{v}$ ($|v| = |\hat{v}| = n$), we apply the following standard statistical operations:
• mean centering subtracts the mean μ from each element of a vector v, i.e., $v' = v - \mu(v)$. this standardization adjusts the scales of magnitude by making the new mean 0 and helps compare data from varied sources or having different datatypes.
• mean squared error (mse) is calculated as $\frac{1}{n} \sum_{i=1}^{n} (v_i - \hat{v}_i)^2$.
• pearson correlation coefficient (pcc) between $v$ and $\hat{v}$ measures the strength of a linear association between two variables, where the value pcc = 1 is a perfect positive correlation and −1 is a perfect negative correlation.
• positivity rate ρ is the ratio of the number of individuals who tested positive to the number of tests performed daily [63].
this section is classified into the following three subsections: (1) and (2) table 2. unless otherwise stated, the feature set comprises gdp, gender, ethnicity, health care, homeless, lockdown type, population density, airport activity, and age groups, whereas the output labels consist of infected and death scores on a scale of 0-6.
we apply supervised machine learning (ml) approaches to identify the key factors affecting covid-19 infected and death counts. for each supervised ml technique, we perform an exhaustive search of all possible combinations of any 5 features and identify the feature subset(s) with the highest accuracy (discussed in sec. 2.2) as the most important features. fig 1 shows the scores for different supervised methods.
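the exhaustive subset search described above can be sketched in plain python. this is an illustration under stated assumptions, not the authors' implementation: it uses a hand-written nearest centroid classifier as the only learner, scores each subset by accuracy on the same rows it was fitted on (the text does not state how the data were split), and runs on a made-up toy table. all function and feature names below are hypothetical.

```python
from itertools import combinations


def nearest_centroid_fit(rows, labels):
    """Return {label: centroid}, where a centroid is the column-wise class mean."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return {
        label: [sum(col) / len(members) for col in zip(*members)]
        for label, members in groups.items()
    }


def nearest_centroid_predict(centroids, row):
    """Assign the row to the class with the closest centroid (squared distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(predicted, actual):
    """Fraction of predictions that match the actual labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def best_feature_subsets(data, labels, feature_names, k):
    """Score every k-feature subset; return (best accuracy, all subsets reaching it)."""
    best_acc, best = -1.0, []
    for subset in combinations(range(len(feature_names)), k):
        rows = [[row[i] for i in subset] for row in data]
        centroids = nearest_centroid_fit(rows, labels)
        predictions = [nearest_centroid_predict(centroids, r) for r in rows]
        acc = accuracy(predictions, labels)
        names = tuple(feature_names[i] for i in subset)
        if acc > best_acc:
            best_acc, best = acc, [names]
        elif acc == best_acc:
            best.append(names)
    return best_acc, best


# Toy table: one informative column ("pd") and two uninformative ones.
toy_data = [[1, 5, 3], [2, 1, 3], [8, 4, 3], [9, 2, 3]]
toy_labels = [0, 0, 1, 1]
print(best_feature_subsets(toy_data, toy_labels, ["pd", "noise_a", "noise_b"], k=2))
```

the number of subsets grows combinatorially with the size of the feature pool, so a search of this kind is practical only because the feature set here is small.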
although proposing a machine learning algorithm that works best on covid-19 data is not the purpose of this study, it is worth reporting that the decision tree classifier (dt) slightly outperforms the other algorithms for both infected and death scores. we create a pool of all features participating in at least one combination for output labels of infected and death scores. fig 2 shows a heatmap of the importance i for all such features against each supervised technique. for infected score as output label (top figure), homeless (home), population density (pd), airport activity (air), testing (test), white (wht), etc. have the highest i. for death score as output label, pd, air, test and age groups above 50 years (age50_54 and age80_84) exhibit the highest importance.
we apply the extra trees classifier to generate the impurity-based rank for the features (discussed in sec. 2.2). fig 3a shows the top 5 important features corresponding to the infected and death scores, respectively. it is interesting that for both cases the same set of features, namely population density, days to peak, airport traffic, testing and high age groups, is identified. also note that the same features exhibit a very high participation in the 5-feature combinations shown in fig 2. next, as a validation exercise, we apply dimension reduction (pca) on the feature set of these key factors (fig 3b).
we discussed in sec. 2.1 that our initial dataset groups ages into brackets of 4 (0-4, 4-8, and so on). our results from supervised learning (sec. 3.1) and extra trees (sec. 3.2) suggest that high age groups are important factors affecting the infected and death scores of covid-19. to understand the effect of covid-19 infected and death scores on low and high age groups, we create two feature sets for populations of age ≤40 and >40. fig 4a shows that for both infected and death scores, the accuracy (acc) is higher for higher age groups.
we explore this by repeating the above experiment, this time with a feature set of groups 40-60 and >60. fig 4b depicts that acc for age group 60+ is marginally higher, suggesting that the elderly are amongst the most vulnerable; however, the difference in mortality rates in this case was not statistically significant.
we carry out a study to identify the pre-lockdown factors of any region (us states in our case) that contribute to the overall post-lockdown infection and death numbers. we partition the total infected and death numbers for each state into pre- and post-lockdown infected and death counts. we then create a feature set consisting of population density, airport activity, pre-lockdown infected, pre-lockdown death, days between first infected and lockdown, and age group above 80. these features represent the set of observable factors for the administrative and health bodies and were already shown to possess high feature significance in the previous section. the output labels are the post-lockdown infected and post-lockdown death numbers. we perform the following experiments:
3.4.1 identification of discriminating features.
we carry out a simple preprocessing step to convert each feature entry to a percentile (with respect to the feature vector) and rank the us states in decreasing order of infected and death scores (fig 5). we calculate the weighted average percentile of features for the top and bottom k = 10 us states using a formula in which $p(f_i)$ and $\rho(f_i)$ are the percentile and rank of the $i$-th feature value, while $r$ is the number of us states (equal to the maximum rank). we intuit that the features exhibiting the maximum difference in weighted average percentile between the top and bottom k covid-19 affected us states are the discriminating ones.
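the percentile comparison above can be illustrated with a short sketch. the weighting formula itself is not reproduced in this text, so the sketch below assumes a plain (unweighted) mean of percentiles for the top and bottom k states; that simplification, the function names and the toy numbers are all assumptions made for illustration.

```python
def percentile_ranks(values):
    """Percentile of each entry: share of entries less than or equal to it, times 100."""
    n = len(values)
    return [100.0 * sum(other <= v for other in values) / n for v in values]


def top_bottom_gap(feature_values, outcome_scores, k):
    """Mean feature percentile of the k highest-outcome states minus that of the k lowest.

    Assumption: an unweighted mean stands in for the weighted average in the text.
    """
    pct = percentile_ranks(feature_values)
    # Order state indices from the highest outcome score to the lowest.
    order = sorted(range(len(outcome_scores)),
                   key=lambda i: outcome_scores[i], reverse=True)
    top, bottom = order[:k], order[-k:]

    def mean_pct(indices):
        return sum(pct[i] for i in indices) / len(indices)

    return mean_pct(top) - mean_pct(bottom)


# Toy example: a feature that rises with the outcome, and one that is constant.
outcome = [1, 2, 3, 4]
print(top_bottom_gap([10, 20, 30, 40], outcome, k=1))
print(top_bottom_gap([5, 5, 5, 5], outcome, k=1))
```

under this reading, a large positive gap marks a feature whose values are concentrated in the most affected states, while a gap near zero marks a feature that does not separate the two groups.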
fig 6a shows the percentile difference, suggesting that airport and population density are the most significant, while days between first infected and lockdown and the age group of 80+ are the least discriminating.
we apply multiple regression (mr) (see sec. 2.2) to measure the weightage of each of the above features in the observed post-lockdown infected (post_inf) and post-lockdown death numbers (post_dth). we eliminate the days between first infected and lockdown (fst-lock) and age group 80+, which are the least discriminating features from the percentile analysis (see fig 6a). as a prerequisite for mr, we need to eliminate features that are mutually correlated. fig 6b shows that pre-inf and pre-dth are highly correlated, and hence we run two separate batches of mr: (1) population density, airport activity, pre-lockdown infected and (2) population density, airport activity, pre-lockdown death.
we explore the effect of testing and lockdown on infection spread. we utilize the positivity ratio ρ (defined in sec. 2.3) to gauge how widespread the infection is [63]. we acquire the daily infected and testing counts in the us (see sec. 2.1.3) and plot the mean daily ρ across all states over the period of february-july 2020. fig 7a shows that testing increased over time, while the positivity ratio dropped post lockdown (shown by the red dotted line). while testing (and, by extension, the positivity ratio) is an effective epidemiological indicator, it cannot curb infection spread by itself. however, fig 7a shows that ρ dropped approximately three weeks into the lockdown, suggesting that the latter had an impact on curbing spread by minimizing social contact.
table 3 shows that pre-infected and pre-death, with high coefficients, contribute highly towards the post-lockdown infected and death numbers, followed by population density and airport traffic.
this finding is further supported by the p values reported for the respective features. note that the $r^2$ scores for all four cases are >0.8, suggesting that the input features explain a high proportion of the variance in the output labels. overall, the pre-infected count has a higher coefficient and $r^2$ score and emerges as a marginally better discriminating feature of post-lockdown effects than the pre-death count.
in sec. 3.2, we perform pca on the feature set of the key factors to show that states with high infection and death numbers stand out of the cluster of other states. these states include some erstwhile hotspots forming group 1 (such as new york city, new jersey, massachusetts, connecticut, rhode island) as well as states experiencing a steady infection and death count and also a strong second wave, forming group 2 (such as texas, washington, california, georgia, arkansas, utah and colorado) (fig 3b). in the pca analysis, pc1 and pc2 account for 41% and 21% of the variance, respectively. we explore how each feature influences each component to show that pc1 is driven by factors such as airport activity and high age groups (70 and beyond), while pc2 is dominated by population density, airport, age (80+) and testing. notice in fig 3b that, though both groups 1 and 2 exhibit high spread across pc1, group 2 forms a slightly denser cluster than group 1, implying that it exhibits an even mix of pc1 and pc2 features. we intuit that the early peaking in infection in group 1 states is due to high road and airport mobility leading to high mixing and infection spread that is manifested in the elderly population. group 2 shows enduring infection spread due to high population density and testing, in addition to airport activity and populations with higher age groups.
we study how demographics affect covid-19 numbers to show that states with larger populations in the higher age groups (particularly 60 and beyond) are the most vulnerable. finally, we split the infected and death numbers into the pre- and post-lockdown epochs and apply multiple linear regression to show that pre-lockdown infected and death, population density and airport activity contribute highly to the post-lockdown numbers. this analysis can be particularly effective in pinpointing the most vulnerable states and recommending lockdown policies on starting dates and duration to curb pandemic spread.
note that our present study pertains to the identification of the discriminatory features with respect to the date of lockdown. there exist several unanswered questions regarding the impact of the length, scheduling strategies, types and extent of lockdowns on pandemic spread. such an analysis requires a richer feature set as well as a sound understanding of the dynamics of infection spread in terms of healthcare, distance, mobility, etc. as a preliminary study, we first explore whether there is any relationship between the health care index (health) of a us state and the number of transitions from infected to death (dth/inf) in this state. the pearson correlation coefficient (see sec. 2.3) between the two factors is 0.11, suggesting that the overall mortality numbers are largely unrelated to the healthcare facility and may solely depend on the infected individual's attributes, such as age, comorbidities, infection severity, etc. second, since proximity plays a role in infection spread, neighboring regions should peak at nearly the same time. we posit that mobility may play an even greater role in the spread than a static measure like the distance between a pair of regions. in the absence of an inter-state mobility dataset, we create two feature sets for the nyc boroughs dataset (see sec. 2.1): (1) inter-borough distance and (2) inter-borough mobility.
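the comparison between these two feature sets relies on the mean squared error and pearson correlation coefficient from sec. 2.3. a minimal sketch of one way to combine them is given here: for every pair of boroughs, measure how dissimilar their distance (or mobility) vectors are via mse, and correlate that dissimilarity with the absolute gap in their days-to-peak. the dictionary layout, the function names and the toy numbers are assumptions, not the authors' code or data.

```python
from itertools import combinations
from math import sqrt


def mse(v, w):
    """Mean squared error between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(v, w)) / len(v)


def pearson(x, y):
    """Pearson correlation coefficient; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def dissimilarity_vs_peak_gap(vectors, days_to_peak):
    """Correlate pairwise vector MSE with the pairwise gap in days-to-peak.

    vectors: {region: [value toward each region]}, days_to_peak: {region: days}.
    """
    dissimilarities, gaps = [], []
    for a, b in combinations(sorted(vectors), 2):
        dissimilarities.append(mse(vectors[a], vectors[b]))
        gaps.append(abs(days_to_peak[a] - days_to_peak[b]))
    return pearson(dissimilarities, gaps)


# Toy regions "a", "b", "c" with made-up vectors and peak days.
toy_vectors = {"a": [0, 0], "b": [1, 1], "c": [3, 3]}
toy_peaks = {"a": 0, "b": 1, "c": 4}
print(dissimilarity_vs_peak_gap(toy_vectors, toy_peaks))
```

running the same function once with distance vectors and once with mobility vectors gives two correlation values that can be compared directly.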
each borough b has a distance vector $d_b = \{d_{b1}, d_{b2}, \dots\}$ and a mobility vector $m_b = \{m_{b1}, m_{b2}, \dots\}$, where $d_{bi}$ and $m_{bi}$ are the probabilistic measures of distance and mobility between borough b and borough i. we calculate the correlation of the mean squared error (see sec. 2.3) of the distance/mobility vectors of any pair of boroughs $b_1$ and $b_2$ against the absolute difference of their peak infected or peak death features. fig 7b suggests that mobility yields a higher correlation (0.44) than distance (0.22), suggesting that mobility is a slightly more informative feature to analyze infection spread.
we are currently working towards broadening the scope of this study in different directions. first, this work attempted to apply ml analysis on a wide range of features, making the states of the united states the ideal choice, specifically from the standpoint of data availability. in future we would like to extend this work by running these experiments on epidemiological, demographic and economic data of different countries. it would be interesting to report the variation in the discriminatory features identified for different countries. second, we identify population density, testing, airport activity and pre-lockdown infected count as key features driving the post-lockdown infection and death numbers. we plan to utilize these findings to design policies on the timing, duration and stringency of lockdown for future pandemics. third, all the input features discussed in this work are static or time invariant. it is imperative to analyze the evolution of dynamic features (such as gdp and unemployment rates) from the pre-covid to the post-covid timelines to uncover the long-term economic effects of covid-19.
machine learning is emerging as an important tool to predict the dynamics of spread of covid-19 and identify the key factors driving infection and mortality rates.
while existing works study the effects of gender, race, age, testing, social contact and distancing separately, we present an unified analysis of the demographic, economic, and epidemiological, ethnic and health indicators for infection and mortality rates from covid-19. we curate a dataset of us states comprising features (from varying sources discussed in sec. 2.1) that may potentially impact infection and death rates of covid-19. we run several supervised machine learning techniques to identify and rank the key factors correlating with infection and fatality counts. population density, testing rate, airport traffic, high age groups emerge as significant, while ethnicity, gender, healthcare index, homeless and gdp have little or no impact on pandemic spread and mortality. coronavirus: what have been the worst pandemics and epidemics in history coronavirus world map: which countries have the most cases and deaths epidemiology, causes, clinical manifestation and diagnosis, prevention and control of coronavirus disease (covid-19) during the early outbreak period: a scoping review covid-2019 and world economy. covid-2019 and world economy covid-induced economic uncertainty is this the second wave of covid-19 in the u.s.? or are we still in the first? how will country-based mitigation measures influence the course of the covid-19 epidemic? the lancet in beijing it looked like coronavirus was gone. now we're living with a second wave daily covid-19 cases in india continue to soar, japan's tokyo in fears of 2nd wave of infections a fiasco in the making? 
as the coronavirus pandemic takes hold, we are making decisions without reliable data it's time to get real about the misleading data analysis of factors associated with disease outcomes in hospitalized patients with 2019 novel coronavirus disease factors affecting covid-19 transmission the origin, transmission and clinical therapies on coronavirus disease 2019 (covid-19) outbreak-an update on the status prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal artificial intelligence and machine learning to fight covid-19 machine learning using intrinsic genomic signatures for rapid classification of novel pathogens: covid-19 case study coronavirus (covid-19) classification using ct images by machine learning methods wrong but useful-what covid-19 epidemiologic models can and cannot tell us prediction of epidemic trends in covid-19 with logistic model and machine learning technics modified seir and ai prediction of the epidemics trend of covid-19 in china under public health interventions the association of race and covid-19 mortality. eclinicalmedicine, 100455 characteristics of hospitalized adults with covid-19 in an integrated health care system in california covid-19 diagnosis prediction by symptoms of tested individuals: a machine learning approach. medrxiv hossain countries are clustered but number of tests is not vital to predict global covid-19 confirmed cases: a machine learning approach. medrxiv randomized placebo-controlled trials of remdesivir in severe covid-19 patients: a systematic review and meta-analysis. medrxiv center for disease control and prevention. covid-19 testing overview scikit-learn: machine learning in python world population review. gross domestic product list of geographic centers of the united states population distribution by gender population distribution by race agency for healthcare research and quality. health care quality: how does your state compare? 
2013 ahar: part 1-pit estimates of homelessness in the u cdc covid data tracker covid-19 cases covid19 us lockdown dates dataset united states census. state population by characteristics list of the busiest airports in the united states center for disease control and prevention. previous u.s. viral testing data nyc-covid19 borough level breakdown scikit-learn-preprocessing -kbinsdiscretizer machine learning: a review of classification and combining techniques support vector machine stochastic gradient descent scikit learn developers (bsd license) scikit learn developers (bsd license) scikit learn developers (bsd license). naive bayes scikit learn developers (bsd license) multiple linear regression support vector machine-a survey stochastic gradient descent an overview of gradient descent optimization algorithms a local mean-based k-nearest centroid neighbor classifier simplifying decision trees. international journal of man-machine studies an empirical study of the naive bayes classifier scikit-learn classifier tuning from complex training sets covid-19 testing: understanding the "percent positive the news tribune. washington state reports 455 new covid-19 cases if trends persist, houston would become the worst affected city in the us, expert peter hotez says dph reports almost 900 new cases of covid-19 in ga hundreds test positive for covid-19 at tyson foods plant in arkansas covid-19 cases rise as hospitalizations remain low in colorado utah confirms 394 new coronavirus cases; 3 more deaths on sunday the authors would like to acknowledge the editor/reviewers for critically assessing the materials and providing suggestions that significantly improved the presentation of the paper. furthermore, they acknowledge the department of computer science, virginia commonwealth university for its computational resources. validation: satyaki roy.visualization: satyaki roy. writing -review & editing: preetam ghosh. 
key: cord-032227-xxa0hlpu authors: pyszczynski, tom; lockett, mckenzie; greenberg, jeff; solomon, sheldon title: terror management theory and the covid-19 pandemic date: 2020-09-17 journal: j humanist psychol doi: 10.1177/0022167820959488 sha: doc_id: 32227 cord_uid: xxa0hlpu terror management theory is focused on the role that awareness of death plays in diverse aspects of life. here, we discuss the theory’s implications for understanding the widely varying ways in which people have responded to the covid-19 pandemic. we argue that regardless of whether one consciously believes that the virus is a major threat to life or only a minor inconvenience, fear of death plays an important role in driving one’s attitudes and behavior related to the virus. we focus on the terror management theory distinction between proximal defenses, which are activated when thoughts of death are in current focal attention and are logically related to the threat at hand, and distal defenses, which are activated when thoughts of death are on the fringes of one’s consciousness and entail the pursuit of meaning, personal value, and close relationships. we use this framework to discuss the many ways in which covid-19 undermines psychological equanimity, the diverse ways people have responded to this threat, and the role of ineffective terror management in psychological distress and disorder that may emerge in response to the virus. although there are many disturbing aspects of the covid-19 pandemic, from the perspective of terror management theory (tmt; greenberg et al., 1986; solomon et al., 2015) , the enormous death toll and highly contagious nature of the virus play especially important roles in spawning the diverse forms of turmoil that have resulted from this crisis. we argue that the salience of death brought on by covid-19 plays a central role in driving the attitudes and behavior of even those who believe that the dangers of the virus have been vastly exaggerated. 
tmt is focused on the pervasive role death awareness plays in human affairs. much of what has been learned from the past 35 years of research applying tmt to diverse aspects of human behavior is directly relevant to understanding responses to the pandemic, especially the distinction between proximal defenses, which are directly focused on the problem of death, and distal defenses, which bear no logical relationship to death but enable people to construe themselves as valuable contributors to a meaningful, significant, and permanent universe (pyszczynski et al., 1999) . in this article we discuss how the salience of death inherent in covid-19 influences diverse reactions to this pandemic. tmt (greenberg et al., 1986; pyszczynski et al., 2015) posits that an inherent consequence of humankind's sophisticated cognitive abilities is awareness of the inevitability of death. awareness of death in an animal with an inherent proclivity for self-preservation gives rise to an ever-present potential for existential terror. this potential for terror is managed by an anxiety-buffering system consisting of cultural worldviews, self-esteem, and close interpersonal relationships. cultural worldviews are shared beliefs about reality that provide answers to basic questions about life, standards for valued behavior, and the promise of literal or symbolic immortality to those who live up to these standards. literal immortality beliefs provide hope that life will continue after physical death, as exemplified by afterlife concepts such as heaven, reincarnation, or joining with ancestral spirits. symbolic immortality comes from contributing to something greater than oneself that will continue long after one has died, such as a family, nation, or the memories of others. selfesteem is a sense of personal value that results from believing that one is living up to the standards of one's cultural worldview. 
close relationships provide consensual validation of one's worldviews and self-esteem needed to maintain confidence in them, as well as providing security in their own right (mikulincer et al., 2003). tmt posits that people manage the potential for anxiety inherent in awareness of the inevitability of death by maintaining faith in their cultural worldviews, self-esteem, and close relationships; these anxiety-buffering systems mitigate existential terror by imparting a sense that one is a person of value living in a meaningful world (for a more thorough presentation of these ideas, see solomon et al., 2015). research has supported a network of converging hypotheses derived from tmt. this research shows that (1) reminders of death (mortality salience) increase commitment to one's worldview, self-esteem, and relationships, and increase defense of these entities when threatened; (2) bolstering self-esteem, worldview, or relationships makes one less prone to anxiety and anxiety-related behavior in response to threats; (3) threats to worldview, self-esteem, and relationships increase the accessibility of death-related thoughts; and (4) self-esteem striving, cultural worldview defense, or affirming close relationships in response to mortality salience reduce death thought accessibility and the need for further terror management defenses; this suggests that the three components of the anxiety buffer are psychologically interchangeable (hart et al., 2005). for a recent review of the tmt literature, see pyszczynski et al. (2015). meta-analyses have found strong evidence that reminders of death increase commitment to one's worldview (burke et al., 2010) and that threats to one's worldview increase the accessibility of death-related thoughts (steinman & updegraff, 2015). tmt posits that people manage death anxiety with two distinct systems, referred to as proximal and distal defenses (pyszczynski et al., 1999).
when death-related thoughts are conscious (in current focal attention), proximal defenses are activated to suppress such thoughts or push death into the distant future by denying one's vulnerability to things that could kill, or intending to engage in healthier behavior to ensure a longer life. however, when death-related thoughts are on the fringes of consciousness (no longer in focal attention but still highly accessible), people activate distal defenses focused on maintaining faith in their cultural worldview and enhancing self-esteem. conscious awareness of death requires defensive maneuvers that "make sense," in that they imply that death is not a problem until many years in the distant future. but proximal defenses do little to quell anxiety stemming from the ultimate inevitability of death. these concerns are assuaged by distal defenses that are logically unrelated to death but imbue one's life with meaning, value, and the promise of either literal or symbolic immortality. research has shown that proximal defenses emerge shortly after reminders of death and that distal defenses emerge in response to death reminders only after a delay and distraction; however, distal defenses emerge immediately with no need for delay and distraction when death reminders are presented subliminally and thus bypass conscious attention. research has also shown that distal defenses reduce the accessibility of death-related thoughts, which is presumably how they manage anxiety (see arndt et al., 2002). the personal, social, economic, and political costs of the covid-19 crisis are unprecedented. from the perspective of tmt, the root cause of all these problems is glaringly obvious: the risk of dying from the virus.
regardless of how contagious and lethal the virus ultimately turns out to be, or what one consciously thinks about it, the possibility of dying from it is highly salient and evident in ever-increasing death toll statistics, vivid images of overburdened hospitals and makeshift morgues, and the testimonials to victims of the virus, both famous and unknown. the deadly disease is spawned by an invisible pathogen that is conveyed by droplets expelled in the breath of its victims and thus might be lurking almost anywhere. since early march, media coverage of the pandemic in the united states and europe has been virtually nonstop, interrupted only by coverage of a looming global economic collapse, lethal police violence against african americans, and the protests and social upheaval in response to it. george floyd's death and related tragedies triggered a reinvigorated focus on social and economic injustice in american society, including peaceful protests and organized campaigns calling for police reform and greater support for the black lives matter movement. however, media coverage has also highlighted the violence, vandalism, looting, and general disarray spawned by the protests and in response to them (johnson, 2020; kilgo, 2020) . what has emerged from the co-occurrence of the pandemic and social upheaval is a constant barrage of threatening information. moreover, it is impossible to visit a social media website without being inundated with new and often contradictory information on the virus. covid-19 thus poses a ubiquitous dramatic reminder of vulnerability and death. because of its potential lethality, attempts to stem the tide of the virus or "flatten the curve" have led to closing many businesses and public venues and, consequently, led to loss of income and jobs, falling stock market values, general economic chaos, and social isolation, all of which seriously undermined major resources for managing the potential terror of death (fitzgerald et al., 2020) . 
public gatherings of all kinds were initially prohibited and then later strictly regulated, creating a void in personal contacts and near-total isolation for some (banerjee & rai, 2020) . information provided by governments, scientists, and the health care community has been confusing and sometimes contradictory, with partisan media outlets exacerbating the problem by providing narratives tailored to their constituents and critical of those with different ideological affiliations (bermejo et al., 2020; jurkowitz & mitchell, 2020) . these side effects of the pandemic seriously undermined all three components of the anxiety-buffering system that people use to maintain equanimity. the world has suddenly become an even more chaotic, confusing, and hostile place, in which death lurks around every corner, and people struggle to maintain meaning and self-esteem. people are living with the very real threat of death from the pandemic, combined with challenges to their worldviews, loss of jobs, impediments to career goals, and isolation from friends and family who normally validate one's significance. from a tmt perspective, it is currently far more difficult for virtually all of us to manage the terror of death. people have responded to the pandemic in a wide variety of ways, some rational and some less so, some adaptive, and some destructive. the terror management health model (goldenberg & arndt, 2008) applies tmt and the distinction between proximal and distal defenses to health-related behavior. it suggests that thoughts of death can increase either motivation for healthy behavior or denial and avoidance when people are consciously focusing on them. however, when such thoughts are on the fringes of consciousness, they increase behavior oriented toward maintaining self-esteem and faith in one's cultural worldview, which could either facilitate or undermine health. we now consider some of the ways in which people employ proximal and distal tactics to cope with covid-19. 
proximal defenses. tmt posits that when thoughts of death are in current focal attention, people attempt to remove them from their consciousness. this can entail simple suppression of such thoughts, denial of the threat, or engaging in behavior to reduce one's vulnerability. given the high level of media coverage, the changes in daily life that provide a constant reminder of the pandemic, and the extent to which virus-related concerns dominate conversations and media reporting, completely avoiding the issue is impossible. but there is evidence of increases in diversion-seeking behavior, such as alcohol consumption (furnari, 2020), excessive eating (ammar et al., 2020), and binge-watching television (dixit et al., 2020). another form of proximal defense involves minimizing one's perception of the threat. this has taken the form of arguing that the virus is not nearly as contagious or lethal as health experts claim it to be (srikanth, 2020), or that it is only lethal for the elderly or those already at risk of dying from other diseases (fox et al., 2020). others have trivialized the virus by comparing it to common illnesses such as the seasonal flu (ritter, 2020), focusing on other common causes of death (mcginty, 2020), or viewing the publicity given the pandemic as originating in a politically motivated conspiracy (romano, 2020). when sky-rocketing death statistics and vivid instances of contagion and mortality in the media make it hard to deny the problem outright, people sometimes claim that death rates are inflated to increase the funding hospitals receive (nunez, 2020), or to bolster the aforementioned conspiracy to damage government leaders (brown, 2020; romano, 2020). another, likely more adaptive, form of proximal defense against covid-19 is to follow the prescriptions for avoiding infection provided by the medical community.
this may be the most common proximal response to the pandemic; surveys suggest that 92% of people have followed guidelines for avoiding infection, to at least some extent (altman, 2020). most people have engaged in some form of social distancing, increased sanitation practices such as hand washing and cleaning surfaces, worn masks in public places, and done other things to stay healthy (eanes, 2020). but the economic and social effects of these measures interfere with feelings of value and connection with the world, the core way in which we distally quell concerns about our mortality. distal defenses. despite the ubiquity of the virus, thoughts of it are not always the focus of conscious thought. this would be too disturbing for most people to bear and could lead to the emergence or exacerbation of psychological disorders. in addition, the proximal defenses that people employ are likely to be at least somewhat effective in removing thoughts of the pandemic from our consciousness. the bulk of the tmt literature suggests that distal defenses focused on affirming one's cultural worldview and maximizing self-esteem emerge when thoughts of death are highly accessible but not in focal attention; given the potential consequences of the virus and the enormous amount of attention the pandemic has attracted, this is likely to be the case for many people a great deal of the time. survey research provides clear evidence of a partisan divide in attitudes and behavior related to the virus. liberals tend to view the virus as much more dangerous than conservatives, report considerably more personal distress about it, and have greater confidence in what scientists and medical professionals have to say about it (funk et al., 2020; ritter, 2020). conservatives, on the other hand, view the virus as less dangerous and are more likely to assign blame to china and other foreigners and view the virus as part of a conspiracy to discredit donald trump (romano, 2020).
the rapid emergence of polarized liberal and conservative narratives about the virus illustrates the dynamic interplay between individual psychological forces and cultural worldviews that is central to the tmt analysis of the relationship between individual and cultural psychology. this political divide is undoubtedly exacerbated by the accessibility of death-related cognition caused by the pandemic. though surveys have documented this divide since before president trump was elected, there is an even wider divergence regarding his overall handling of the pandemic (bycoffe et al., 2020), attitudes toward those who have pushed back against some of his policies (spangler, 2020), and dr. anthony fauci, who was at one time the major voice for the administration but later voiced some disagreements (brewster, 2020). a political divide is also evident in attitudes toward easing restrictions and reopening businesses and public places, with conservatives much more in favor of such policies than liberals. whereas liberals tend to approve of societal restrictions to prevent the spread of the virus, conservatives tend to view them as unwarranted infringements on freedom and not worth the cost to the economy and individual incomes (shepard, 2020). in many u.s. cities, protests against government restrictions were primarily attended by conservatives, some brandishing assault rifles and white nationalist symbols (mauger, 2020; perrett, 2020). despite initial sentiment that "we're all in this together," the pandemic has become yet another domain for ideological division. from a tmt perspective, reminders of death motivate people to affirm their worldviews, and political ideology is a central element of worldviews for many people.
though some studies show that mortality salience leads to a shift toward more conservative attitudes regardless of political orientation (cohen et al., 2017; landau et al., 2004) , others show it leads to polarization, with conservatives endorsing more conservative attitudes and liberals endorsing more liberal ones (kosloff et al., 2016) . a meta-analytic review of this literature concluded that there is evidence for both tendencies, with the evidence being somewhat stronger for polarization (burke et al., 2013) . we are seeing this polarization playing out in both proximal and distal reactions to the threat of the virus. one current example of intensified reactions may be the powerful and sometimes violent protests in response to the killing of george floyd. this was far from the first unjust killing of a black person by the police, but it has clearly led to the most intense and widespread outrage and protests of any of them. perhaps it is the final tragic straw that broke the proverbial camel's back that occurred at a time when people had more time to engage due to shutdowns in response to the virus. but we argue that the background of death thought accessibility due to the pandemic probably intensified these reactions. fueled by a greater need for terror management, many people jumped fervently onto this cause as a way to feel that they are doing something of value in their lives, when in reality their ability to feel that way has been so hampered by loss of jobs and income, social isolation, and difficulties in making sense of the tragedy and divisiveness that has emerged in the wake of the virus. meaning and significance derived from participating in these mass social protests may ironically increase death salience as protestors gather in large groups that exponentially increase their chance of exposure to the virus. 
protests are also threatening in that, though most have been peaceful, there is a lurking potential for violence with police and counter-protesters. many people have suffered serious, life-altering injuries from "nonlethal weapons" used by police officers to control protests and riots; for example, some individuals have been permanently blinded after being shot with rubber bullets (bauerlein & calvert, 2020; sheikh & montgomery, 2020). interestingly, the individuals who are risking their lives to protest racial inequality tend to hold the same political views as those who believe that social distancing should be practiced (diamond, 2020; nguyen, 2020). thus, these individuals are in a precarious position, where bolstering one's cultural worldview by protesting racial inequality involves directly, consciously putting oneself in danger of violence and disease. tmt suggests that affirming one's worldview, along with one's self-esteem and close relationships, maintains psychological well-being by buffering death-related thoughts and providing outlets for symbolic immortality. the current crisis raises the intriguing question of what happens when affirming one's worldview involves putting oneself directly in harm's way and, consequently, never fully removing the salience of death from one's consciousness. consistent with the theoretical writings of becker (1973), lifton (1979), and yalom (1980), tmt suggests that when people are not effectively managing their existential terror by building a meaningful and purposeful life, death anxiety and maladaptive ways of dealing with that anxiety are the common result. indeed, it has been argued that both death anxiety and ineffective or disrupted anxiety-buffer functioning are transdiagnostic vulnerability factors for psychological disorder (iverach et al., 2014; yetzer & pyszczynski, 2018).
if fear of death does indeed motivate the pursuit of meaning in life, self-esteem, and close relationships, then problems in managing death concerns exacerbated by the pandemic would leave people overwhelmed with anxiety and therefore more vulnerable to psychological disorder. experimental research has shown that reminders of mortality exacerbate phobias, obsessive-compulsive behaviors, depressive affect, and anxiety (e.g., finch et al., 2016; menzies & dar-nimrod, 2017; mikulincer et al., 2020; strachan et al., 2007). this may help explain why a recent review found that the pandemic is associated with increased reports of anxiety, depression, and stress (torales et al., 2020; wang et al., 2020). the covid-19 pandemic might cause psychological distress in two ways that correspond with proximal and distal defenses against death-related thought. first, the pandemic has directly increased death anxiety by raising awareness of personal vulnerability; recent research shows that the pandemic has increased anxiety and fear regarding one's physical well-being (jungmann & witthöft, 2020). in addition, maladaptive proximal defenses may entail harmful practices aimed at avoiding the virus; some people have gargled bleach and cleaning supplies to reduce their chances of catching the virus (gharpure et al., 2020). people are also employing a variety of unhealthy distractions to shift their focus away from the threat of the virus, including opiate use (https://www.ama-assn.org/system/files/2020-06/issue-briefincreases-in-opioid-related-overdose.pdf) and gambling (https://www.marketwatch.com/story/online-poker-betting-hits-a-record-high-during-the-pandemic-2020-05-29). the pandemic has also undermined distal defenses by hampering or eliminating the anxiety-buffering outlets that people typically rely on to believe that they are valuable contributors to a meaningful world.
when people lose their jobs and cannot pursue their financial, educational, and career goals, they are losing important sources of self-esteem. social relationships, which play such a major role in managing death fears, have also been hampered by the lockdown and social distancing measures. single people looking for a potential life partner have largely had to put this pursuit on hold. covid-19-related stress resulting from all of these aspects of the pandemic is associated with lower levels of meaning in life and life satisfaction (trzebiński et al., 2020). inadequate distal defenses are likely to affect the need for proximal defenses and vice versa. increased death awareness associated with the threat of covid-19 is difficult to successfully manage because covid-19 has undermined access to many aspects of people's anxiety buffers; compromised anxiety buffers leave people vulnerable to experiencing higher levels of death anxiety than usual. how might one manage these overwhelmed death-related defenses? understanding the existential threats associated with the pandemic and reflecting on the proximal and distal defenses one uses to cope with them may help people develop new coping skills in these unprecedented times. for individuals experiencing high levels of death and health anxiety, managing one's engagement with virus-related information may help reduce explicit death anxiety. a recent study found that social media exposure during the pandemic is associated with poorer mental health as it likely contributes to the persistent salience of the virus and its mortality threat (gao et al., 2020). engaging in and acknowledging the efficacy of best practices for avoiding infection is another potentially useful strategy for reducing anxiety.
feelings of meaninglessness resulting from the loss of social relationships and self-esteem sources may be addressed by finding new sources of meaning, significance, and interpersonal connection, preferably ones that do not increase the threat of contracting or spreading the virus. home-based hobbies such as baking bread or exercise have reportedly become popular as ways to derive a sense of meaning and value during the pandemic (vanderwerff, 2020). social events that have been cancelled have sometimes been redesigned as covid-friendly occasions, for example, the increasing popularity of drive-in theaters; online education; outdoor activities, such as hiking, where one can socialize while maintaining distance; and virtual get-togethers for parties, weddings, and funerals. "the new normal" has quickly become a cliché in light of the pandemic, yet it succinctly describes people's adaptations for avoiding the threat of death from the virus while still pursuing goals that give life meaning, value, and connection with others. acknowledging one's emotional distress in response to the pandemic may encourage more creative and constructive ways of coping with it. the tension between measures to keep us safe from this oft-deadly virus and the desire to reopen the economy and resume "normal" life can be viewed as a battle between proximal and distal defenses against death. proximally, we want to forestall death and feel safe from it in the short term. distally, we want to maintain the view that life is meaningful and that we are valuable contributors to that meaningful life. the fundamental dilemma is that measures that keep us safe in the moment often interfere with our ability to find meaning and significance in our lives. both are important psychological concerns, and finding the right compromise to sufficiently meet both needs is the great challenge every culture is facing.
one tragic example of this is that to keep hospitals and nursing homes safe, loved ones are often not allowed to be with their sicker or dying friends and family members. the result is people facing their own and their loved ones' imminent death without the support systems that provide them with their deepest psychological security. perhaps understanding these issues from the perspective of tmt can help societies determine the best versions of the many compromises with which they contend as they move forward with life in the face of the covid-19 pandemic. tom pyszczynski, phd, is distinguished professor of psychology at the university of colorado at colorado springs. he received his phd in psychology from the university of kansas in 1980. with his colleagues jeff greenberg and sheldon solomon, he developed terror management theory, which explores the role of death in life and suggests that cultural worldviews, self-esteem, and close personal relationships function to manage the potential for existential terror that results from the uniquely human awareness of the inevitability of death. he has also conducted research on clinical problems such as anxiety, depression, and posttraumatic stress disorder. he is coauthor or coeditor of several books, including hanging on and letting go: understanding the onset, progression, and remission of depression (1994) , in the wake of 9/11: the psychology of terror (2003) mckenzie's research endeavors focus on the social and emotional outcomes of trauma exposure. in particular, mckenzie often applies social psychological theories, including terror management theory and objectification theory, to understanding how trauma exposure relates to social and motivational processes that are typically only studied in nonclinical samples. 
most americans are practicing social distancing effects of covid-19 home confinement on physical activity and eating behavior: preliminary results of the eclb-covid19 international online survey mortality salience and the spreading activation of worldview-relevant constructs: exploring the cognitive architecture of terror management social isolation in covid-19: the impact of loneliness serious eye injuries at protests spur calls to ban rubber bullets the denial of death coverage of covid-19 and political partisanship: comparing across nations fauci loses support from republicans after trump criticism, poll shows fact check: coronavirus did not spread in the us because of an anti-trump conspiracy death goes to the polls: a meta-analysis of mortality salience effects on political attitudes two decades of terror management theory: a meta-analysis of mortality salience research how americans view the coronavirus crisis and trump's response you're hired! mortality salience increases americans' support for donald trump suddenly, public health officials say social justice matters more than social distance binge watching behavior during covid 19 pandemic: a cross-sectional, cross-national online survey around 50% of people are wearing face coverings because of coronavirus, survey finds. the news and observer terror mismanagement: evidence that mortality salience exacerbates attentional bias in social anxiety an instant economic crisis: how deep and how long? mckinsey & company covid-19 mainly kills old people: so do most other diseases. the print trust in medical scientists has grown in u.s., but mainly among democrats are americans drinking their way through the coronavirus pandemic? 
forbes mental health problems and social media exposure during covid-19 outbreak knowledge and practices regarding safe household cleaning and disinfection for covid-19 prevention-united states the implications of death for health: a terror management health model for behavioral health promotion the causes and consequences of a need for self-esteem: a terror management theory attachment, self-esteem, worldviews, and terror management: evidence for a tripartite security system death anxiety and its role in psychopathology: reviewing the status of a transdiagnostic construct what we're missing when we condemn "violence" at protests health anxiety, cyberchondria, and coping in the current covid-19 pandemic: which factors are related to coronavirus anxiety fewer now say media exaggerated covid-19 risks, but partisan gaps remain riot or resistance? how media frames unrest in minneapolis will shape public's view of protest. the conversation terror management and politics: comparing and integrating the "conservative shift" and "political worldview defense" hypotheses deliver us from evil: the effects of mortality salience and reminders of 9/11 on support for the broken connection: on death and the continuity of life protesters, some armed, enter michigan capitol in rally against covid-19 limits leading cause of death in the us? hint: it isn't covid-19 death anxiety and its relationship with obsessive-compulsive disorder the existential function of close relationships: introducing death into the science of love towards an anxiety-buffer disruption approach to depression: attachment anxiety and worldview threat heighten death-thought accessibility and depression-related feelings conservatives charge liberals with social-distancing hypocrisy trust index: are hospitals inflating covid-19 death count? 
click orlando why anti-lockdown protests are a "magnet" for white supremacists and far-right extremists a dual-process model of defense against conscious and unconscious death-related thoughts: an extension of terror management theory thirty years of terror management theory: from genesis to revelation republicans still skeptical of covid-19 lethality study: nearly a third of americans believe a conspiracy theory about the origins of the coronavirus rubber bullets and beanbag rounds can cause devastating injuries. the new york times republican voters give trump and gop governors cover to reopen the worm at the core: on the role of death in life poll: michigan voters show support for gov why some people are still in denial over coronavirus-and how to convince them to take it seriously delay and death-thought accessibility: a meta-analysis terror mismanagement: evidence that mortality salience exacerbates phobic and compulsive behaviors the outbreak of covid-19 coronavirus and its impact on global mental health reaction to the covid-19 pandemic: the influence of meaning in life, life satisfaction, and assumptions on world orderliness and positivity how to bake bread immediate psychological responses and associated factors during the initial stage of the 2019 coronavirus disease (covid-19) epidemic among the general population in china existential psychotherapy. basic books terror management theory and psychological disorder: ineffective anxiety-buffer functioning as a transdiagnostic vulnerability factor for psychopathology phd, is a regents professor of the authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article. the authors received no financial support for the research, authorship, and/or publication of this article. key: cord-252664-h02qy4z0 authors: kontis, v.; bennett, j. e.; parks, r. m.; rashid, t.; pearson-stuttard, j.; asaria, p.; guillot, m.; blangiardo, m.; ezzati, m. 
title: age- and sex-specific total mortality impacts of the early weeks of the covid-19 pandemic in england and wales: application of a bayesian model ensemble to mortality statistics date: 2020-05-25 journal: nan doi: 10.1101/2020.05.20.20107680 sha: doc_id: 252664 cord_uid: h02qy4z0 background: the covid-19 pandemic affects mortality directly through infection as well as through changes in the social, environmental and healthcare determinants of health. the impacts on mortality are likely to vary, in both magnitude and timing, by age and sex. our aim was to estimate the total mortality impacts of the pandemic, by sex, age group and week. methods: we developed an ensemble of 16 bayesian models that probabilistically estimate the weekly number of deaths that would be expected had the covid-19 pandemic not occurred. the models account for seasonality of death rates, medium-long-term trends in death rates, the impact of temperature on death rates, association of death rates in each week with those in preceding week(s), and the impact of bank holidays. we used data from january 2010 through mid-february 2020 (i.e., week starting 15th february 2020) to estimate the parameters of each model, which was then used to predict the number of deaths for subsequent weeks as estimates of death rates if the pandemic had not occurred. we subtracted these estimates from the actual reported number of deaths to measure the total mortality impact of the pandemic. results: in the week that began on 21st march, the same week that a national lockdown was put in place, there was a >92% probability that there were more deaths in men and women aged ≥45 years than would occur in the absence of the pandemic; the probability was 100% from the subsequent week. taken over the entire period from mid-february to 8th may 2020, there were an estimated ~49,200 (44,700-53,300) or 43% (37-48) more deaths than would be expected had the pandemic not taken place. 
22,900 (19,300-26,100) of these deaths were in females (40% (32-48) higher than if there had not been a pandemic), and 26,300 (23,800-28,700) in males (46% (40-52) higher). the largest number of excess deaths occurred among women aged >85 years (12,400; 9,300-15,300), followed by men aged >85 years (9,600; 7,800-11,300) and 75-84 years (9,000; 7,500-10,300). the cause of death assigned to the majority (37,295) of these excess deaths was covid-19. there was nonetheless a >99.99% probability that there has been an increase in deaths assigned to other causes in those aged ≥45 years. however, by the 8th of may, the all-cause excess mortality had become virtually equal to deaths assigned to covid-19, and non-covid excess deaths had diminished to close to zero, or possibly become negative, in all age-sex groups. interpretation: the death toll of the covid-19 pandemic, in middle and older ages, is substantially larger than the number of deaths reported as a result of confirmed infection, and was visible in vital statistics when the national lockdown was put in place. when all-cause mortality is considered, the mortality impact of the pandemic on men and women is more similar than when comparing deaths assigned to covid-19 as underlying cause of death. the covid-19 pandemic has led to tens of thousands of deaths among patients with confirmed infection in the uk. the pandemic has also profoundly changed the social, economic, environmental and healthcare determinants of morbidity and mortality. 
these changes are likely to impact public health beyond the deaths caused directly by infection through a number of routes [1] including delayed disease prevention and procedures for acute and chronic medical care; loss of jobs and income; disruption of social networks; changes in crime and self-harm; changes in quantity and quality of food, and the use of tobacco, alcohol and other drugs; and changes in mobility and transport patterns with potential impacts on road traffic injuries and air pollution [2]. how changes in these social, environmental and healthcare determinants impact mortality is likely to vary, in both magnitude and timing, by age and sex. an understanding of the mortality impacts beyond deaths assigned to covid-19 infection is needed to understand the overall public health impacts of the pandemic and control policies, titrate and adjust the response, and put in place mitigation mechanisms to minimize the adverse impacts as the pandemic and response continue beyond early weeks. we developed and applied methodology to quantify the weekly mortality impacts of the covid-19 pandemic and associated responses by age group and sex in england and wales. the methodology can also be used for comparable cross-country analysis on a real-time basis. we used data on the weekly number of deaths in england and wales, by age group and sex, released by the office for national statistics (ons). weekly death files include all deaths registered from saturday through the subsequent friday. they also include data on the number of deaths involving covid-19, which are deaths with a mention of covid-19 anywhere on the death certificate, including in combination with other health conditions. we used data from the first week of january 2010 through the week starting on saturday 2nd may 2020 (ending on 8th may 2020). we used data on mid-year population by age group and sex from the ons.
we calculated weekly population through interpolation, as done by the ons for quarterly population [3]. we obtained data on temperature from era5 [4], which uses data from global in situ and satellite measurements to generate a worldwide meteorological dataset, with full space and time coverage over our analysis period. we used gridded estimates measured four times daily at a resolution of 30 km to generate weekly temperatures for each local authority district, which we weighted by local authority population to create national level summaries. the total mortality impact of the covid-19 pandemic should be calculated as the difference between the observed number of deaths and the number of deaths had the pandemic not occurred, which is not directly measurable. the most common approach to addressing this issue has been to use the average number of deaths over previous years, e.g., the most recent five years, for the corresponding week or month when the comparison is made [5]. this approach, however, does not take into account changes in population size and age structure, nor long- and short-term trends in mortality, which are particularly pronounced for some age groups [6, 7]. nor does this approach account for time-varying factors like temperature, that are largely external to the pandemic, but also affect death rates. in addition, bank holidays such as easter that do not occur in the same week of the year affect number of deaths and death registration. we developed an ensemble of 16 short-term bayesian mortality projection models that each make an estimate of weekly death rates that would be expected if the covid-19 pandemic had not occurred. we used multiple models because there is inherent uncertainty in the choice of model that best predicts death rates in the absence of the pandemic. these models were formulated to incorporate features of weekly death rates as follows:
• first, death rates may have a medium-to-long-term trend.
[6-8] we developed two sets of models, one with no trend and one with a linear trend term over weekly deaths.
• second, death rates have a seasonal pattern, which varies by age group and sex [9-12]. we included weekly random intercepts for each week of the year. to account for the fact that seasonal patterns "repeat" (i.e., late december and early january are seasonally similar) we used a seasonal structure [13, 14] for the random intercepts. the seasonal structure allows the magnitude of the random intercepts to vary over time, and implicitly accounts for time-varying factors such as annual fluctuations in flu season.
• third, death rates in each week may be related to those in preceding week(s). we formulated four sets of models to account for this relationship. the weekly random intercepts in these models had a first, second, fourth or eighth order autoregressive structure [13, 14]. the higher order autoregressive models allow death rates in any given week to be informed by those in a progressively larger number of preceding weeks. further, trends not picked up by the linear or seasonal terms would be captured by these autoregressive terms.
• fourth, beyond having a seasonal pattern, death rates depend on temperature, and specifically on whether temperature is higher or lower than its long-term norm during a particular time of year [15-20]. the effect of temperature on mortality varies throughout the year, and may be in opposite directions for different times of year. we used two sets of models, one without temperature and one with a weekly term for temperature anomaly, defined as deviation of weekly temperature from the local average weekly temperature over the entire analysis period.
the coefficients of temperature anomalies were specified as a random effect with a random walk prior of order one, so that temperature effect is more similar in adjacent weeks. the random effect had a circular structure so that late december and early january are treated as adjacent.
• fifth, reported death rates in weeks that contain or follow a holiday may be different from other weeks. we included effects (as fixed intercepts) for the week containing and the week after each of the following holidays: christmas and/or boxing day; good friday; easter monday; new year's day; early-may bank holiday; spring bank holiday; and summer bank holiday.
• we also tested, but did not include, model terms for the weeks that coincided with a change to and from daylight saving time because the effect was negligible.
these choices led to an ensemble [21] of 16 short-term bayesian mortality projection models (2 trend options × 4 autoregressive options × 2 temperature options). we used the time-series of weekly reported deaths from january 2010 through mid-february 2020 (i.e., week starting 15th february 2020) to estimate the parameters of each model, which was then used to predict death rates for subsequent weeks as estimates of the counterfactual death rates (i.e., if the pandemic had not occurred). for the projection period, we used recorded temperature so that our projections take into consideration actual temperature in 2020. this choice of training and prediction periods assumes that the number of deaths that are directly or indirectly related to the covid-19 pandemic was negligible through mid-february 2020, which is about two weeks after the first confirmed case in the uk, but it allows for impacts to have appeared in subsequent weeks.
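the way the 16 ensemble members arise from the three modelling choices can be sketched in a few lines of python. this is only an illustration of the 2 × 4 × 2 combination; the option labels and the `build_ensemble` helper are hypothetical names introduced here, and no model fitting (which the authors did in r-inla) is attempted.

```python
from itertools import product

# Option labels are illustrative names, not identifiers from the paper.
TREND_OPTIONS = ("none", "linear")    # no trend vs. linear trend term
AR_ORDERS = (1, 2, 4, 8)              # order of the autoregressive structure
TEMPERATURE_OPTIONS = (False, True)   # without / with temperature anomaly term


def build_ensemble():
    """Return one configuration dict per model in the ensemble."""
    return [
        {"trend": trend, "ar_order": ar_order, "temperature_anomaly": use_temp}
        for trend, ar_order, use_temp in product(
            TREND_OPTIONS, AR_ORDERS, TEMPERATURE_OPTIONS
        )
    ]


ensemble = build_ensemble()
print(len(ensemble))  # 2 * 4 * 2 = 16
```

each configuration would then be fitted to the same training period and used to project the counterfactual weeks.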
we used weakly informative log gamma priors on log precision with both shape and rate equal to 0.01. we tested the sensitivity of the results to the choice of prior through the use of penalized complexity priors and found that the results were similar. all models were fitted using integrated nested laplace approximation (inla) [22], implemented in the r-inla software (version 20.03). we took 1,000 draws from the posterior distribution of age-specific deaths under each model, and pooled the 16,000 draws to obtain the posterior distribution of age-specific deaths if the covid-19 pandemic had not taken place. the reported credible intervals represent the 2.5th and 97.5th percentiles of the posterior distribution of the draws from the entire ensemble. this approach incorporates both the uncertainty of estimates from each model and the uncertainty in the choice of model. we did all analyses separately by sex and age group (0-14 years, 15-44 years, 45-64 years, 65-74 years, 75-84 years and 85+ years) because death rates, and how they are impacted by the pandemic, vary by age group and sex. to obtain estimates across age groups and both sexes, we summed draws from age-sex-specific estimates. for the purpose of reporting, we rounded results on number of deaths that are ≥1000 to the nearest hundred to avoid giving a false sense of precision in the presence of uncertainty; results <1000 are rounded to the nearest ten. we tested the performance of the projections from the model by withholding data for 11 weeks starting from mid-february (i.e., the same projection period as done for 2020) for an earlier year and used the preceding time-series of data to train the models. we then projected death rates for the weeks with withheld data, and evaluated how well the model ensemble projections reproduce the known-but-withheld death rates. we repeated this for three different years: 2017 (i.e.,
train model using data from january 2010 to mid-february 2017 and test for the subsequent 11 weeks), 2018 (i.e., train model using data from january 2010 to mid-february 2018 and test for the subsequent 11 weeks), and 2019 (i.e., train model using data from january 2010 to mid-february 2019 and test for the subsequent 11 weeks). we report the projection error (which measures systematic bias) and absolute forecast error (which measures any deviation from the data). additionally, we report coverage of the projection uncertainty; if projected death rates and their uncertainties are well estimated, the estimated 95% credible intervals should cover 95% of the withheld data.
weekly mortality varied substantially over time in all ages with evidence of seasonal pattern above 45 years of age (figure 1). from 22nd february through 20th march 2020, the observed number of deaths in every age group was well within the credible interval of what was expected to have occurred if the covid-19 pandemic had not taken place (figures 1 and 2). in the week that began on 21st march, the same week that a national lockdown was put in place, there was already a >92% probability that there were more deaths in both sexes and all age groups ≥45 years than would occur in the absence of the pandemic; the probability was 100% (i.e., every one of the 16,000 draws was positive) from the subsequent week (figures 2 and 3). the same phenomenon occurred in those aged 15-44 years in the first week of april 2020. taken over the entire period from mid-february to 8th may 2020 and across all age groups, 164,373 deaths were registered in england and wales.
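the pooling of posterior draws, the percentile-based credible intervals, the posterior probability of excess deaths and the rounding rule described in the methods can be sketched as follows. this is a minimal illustration that uses synthetic normal draws in place of real r-inla output; the function names, the synthetic means and the observed count of 14,500 are all assumptions made for the example and are not values from the paper.

```python
import random


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a pre-sorted list."""
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def pool_draws(draws_by_model):
    """Concatenate the per-model posterior draws into one ensemble sample."""
    return [draw for draws in draws_by_model for draw in draws]


def summarise_excess(observed, pooled_expected):
    """Median, 95% credible interval and P(excess > 0) for observed - expected."""
    excess = sorted(observed - expected for expected in pooled_expected)
    return {
        "median": percentile(excess, 50),
        "lower": percentile(excess, 2.5),
        "upper": percentile(excess, 97.5),
        "prob_excess": sum(x > 0 for x in excess) / len(excess),
    }


def round_for_reporting(value):
    """Nearest hundred for |value| >= 1000, nearest ten otherwise."""
    return int(round(value, -2)) if abs(value) >= 1000 else int(round(value, -1))


# Synthetic stand-in for 16 models x 1,000 posterior draws of expected deaths.
rng = random.Random(0)
draws_by_model = [
    [rng.gauss(10_000 + 10 * m, 300) for _ in range(1_000)] for m in range(16)
]
pooled = pool_draws(draws_by_model)  # 16,000 pooled draws
summary = summarise_excess(observed=14_500, pooled_expected=pooled)
```

because every model contributes the same number of draws, pooling weights the 16 models equally, so the interval reflects both within-model uncertainty and the spread between models.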
this total represents an estimated ~49,200 (44,700-53,300) or 43% (37-48) more deaths than would be expected had the pandemic not taken place. 22,900 (19,300-26,100) of these deaths were in females (40% (32-48) higher than if there had not been a pandemic), and 26,300 (23,800-28,700) in males (46% (40-52) higher). the largest overall (i.e., from any cause) number of excess deaths occurred among women aged >85 years (12,400; 9,300-15,300), followed by men aged >85 years (9,600; 7,800-11,300) and 75-84 years (9,000; 7,500-10,300). the cause of death assigned to the majority (37,295) of these excess deaths was covid-19 (figures 2 and 3). nonetheless, there was a >99.99% probability that there has also been an increase in deaths assigned to other causes in those aged ≥45 years. the share of total excess deaths not assigned to covid-19 was higher in those aged 75 years and older and was particularly high in 85+ year olds. specifically, over the entire period from 22nd february to 8th may 2020, the share of non-covid excess deaths was 18%, 17%, 18% and 25% in men aged 45-64, 65-74, 75-84 and 85+ years, respectively, and 20%, 15%, 23% and 37% in women aged 45-64, 65-74, 75-84 and 85+ years. the number of non-covid excess deaths also increased with age, reaching 1,600 (180-3,000) and 1,500 (110-2,800) in men and women aged 75-84 years, respectively, and 2,400 (580-4,100) and 4,600 (1,400-7,400) in men and women aged 85+ years. however, by the 8th of may, non-covid excess deaths had diminished to close to zero or possibly become negative in all age-sex groups. in boys and young men aged 0-14 and 15-44 years, there may have been a short-term decline, lasting two weeks in mid-late march, in non-covid deaths but the probability that deaths were lower than would be expected without the pandemic in different weeks was <80%. similarly, in the first week of may, deaths in boys/men and girls/women aged <45 years may have dropped below what would be expected without the pandemic with posterior probabilities ranging from 69% to 89%. the results of model validation (table 2) show that the estimates of how many deaths would be expected in different weeks had the pandemic not occurred had mean projection errors <6% in all age-sex groups. the mean absolute error was also <9% in all age-sex groups except in those aged 0-14 years where the number of deaths is small. the 95% coverage, which measures how well the posterior distributions of projected deaths coincide with withheld data, was >93% for all age and sex groups, which shows that the posterior distribution is well estimated. we applied a robust probabilistic method to coherently and consistently estimate the total death toll of the covid-19 pandemic from 22nd february to 8th may. at this stage, the covid-19 pandemic was responsible for over 49,000 excess deaths in england and wales. we also found that when all-cause mortality is considered, the mortality impact of the pandemic on men (~46% increase in deaths) and women (~40% increase in deaths) is more similar than when comparing deaths assigned to covid-19. deaths that were not assigned to covid-19 made up 24% of all excess deaths, but had diminished in most age-sex groups by the first week of may. we also rule out, with virtual certainty (posterior probability >99.99%), the hypothesis that the death toll of the pandemic may be smaller than direct covid-19 deaths either because some of those dying of covid-19 would have died in this same time period from other underlying conditions even if the pandemic had not occurred [23, 24].
further, although children and young adults may have experienced short-term declines in deaths, taken across all age groups any "positive" impacts of the lockdown (e.g., reductions in deaths due to air pollution or traffic injuries) are dwarfed by the negative impacts of the epidemic. our overall estimates are similar to those reported by the ons [25] and the financial times [5] but our findings reveal important details on excess deaths by age group and sex which these sources do not. euromomo does not report country-specific excess deaths and hence could not be compared with our results. the main strength of our work is the systematic use of time-series data from 2010 to early 2020 to estimate how many deaths would be expected in the absence of the pandemic. by modelling death rates, rather than simply the number of deaths as is done in most other analyses, we account for changes in population size and age structure. the models incorporated important features of mortality, including seasonality of death rates, how mortality in one week may depend on previous week(s) and the seasonally-variable role of ambient temperature. the use of a modelling framework, as we have done, allowed us to make estimates by age group and sex, and, as data become available, will allow doing so for specific causes of death and subnational geographies which would, because of smaller numbers, not be possible for other methods. we used an ensemble of models, which typically leads to more robust projections and represents both the uncertainty associated with each individual model and that of model choice.
[21] finally, this framework, specifically the inclusion of seasonality and ambient temperature, is well suited for more robust estimation and standardised comparisons of excess deaths across countries on a real-time basis. the main limitation of our work is that we did not have data on underlying cause of death beyond the distinction between covid-19 and non-covid deaths. having a breakdown of deaths by underlying cause will help develop cause-specific models and understand which causes have exceeded or fallen below the levels expected. we also could not access age-specific data by region because the ons only releases aggregate numbers for regions. nor did we have data on total mortality by socio-demographic status to understand inequalities in the impacts of the pandemic beyond deaths assigned to covid-19 as the underlying cause of death. releasing these data will allow more granular analysis of the impacts of the pandemic, which can in turn inform resource allocation and a more targeted approach to mitigating both the direct and indirect effects of covid-19, now and for future waves of the pandemic. further, weekly mortality files from the ons cover deaths registered in any given week. these include some deaths from prior weeks and leave out some deaths in the reporting week [26]. however, the approach is consistent over time and does not affect year to year comparisons, including for 2020 as the lags in registration of deaths assigned to covid-19 seem to be the same as those from other causes [27]. it is likely that some of the apparently non-covid excess deaths are due to undetected covid-19 infections [28]. one example of such deaths is the likely covid-19 deaths in care homes.
[29, 30] other such deaths may be those of people who called the nhs helpline or the ambulance service, were advised to self-isolate because their symptoms were not deemed sufficiently severe to be admitted, and died at home [31, 32]. that the share of excess deaths from non-covid causes became smaller over time may be because, with increasing awareness of and attention to clinical symptoms, more of such deaths are assigned to covid-19 as the underlying cause. it is also possible that the pandemic has caused, and may continue to cause in coming months, many excess deaths through changes in personal and family economic and employment circumstances, and in healthcare provision, access and utilisation, as evidenced by reductions in a&e attendance and procedures for a diverse range of acute and chronic conditions [33-41]. while the official position on this has been to encourage people to seek care, the situation is more complex: for some, accessing care becomes more restricted because their family members are infected and cannot continue supporting them. others, typically those in limited and marginalised housing and employment, may not do so in fear of losing their livelihood. the large death toll of the pandemic, from deaths assigned to covid-19 as well as other causes, together with the fact that excess death toll was already happening when a national lockdown was announced indicate that the cessation of community contact tracing in early
march, hesitation in putting a lockdown in place earlier, and sub-optimal identification and management of those with complex conditions in the community at increased risk from both the virus and effects of lockdown, are likely to have contributed to the substantial excess deaths. minimising these impacts requires a coherent strategy that suppresses the epidemic and strengthens the social safety net and healthcare provision, in facilities as well as in the community and at home, together with transparent communication to encourage resumption of care seeking. we thank giulia mangiameli for help with background materials and references. all authors contributed to study design. vk and jeb developed and tested statistical methods with input from other authors. vk, rmp, tr and jeb accessed, harmonised and analysed data. vk conducted analysis and prepared results. me wrote the first draft of the paper and other authors contributed to the paper. me reports a charitable grant from the astrazeneca young health programme, and personal fees from prudential and scor, outside the submitted work. jp-s is vice-chair of the royal society for public health and reports personal fees from novo nordisk a/s and lane, clark & peacock llp, outside of the submitted work.
medrxiv preprint doi: https://doi.org/10.1101/2020.05.20.20107680; this version posted may 25, 2020. the copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. it is made available under a cc-by 4.0 international license.
figure: cumulative deaths. the grey shading shows the levels of credible intervals around the median prediction, from 5% (dark grey) to 95% (light grey). the red points show the number of deaths assigned to covid-19 as underlying cause of death. the difference between these points and the curves is excess non-covid-19 deaths.

mitigating the wider health effects of covid-19 pandemic response
and year-end review
reanalysis datasets
excess uk deaths in covid-19 pandemic top 50,000. financial times
the future of life expectancy and life expectancy inequalities in england and wales: bayesian spatiotemporal forecasting
contributions of diseases and injuries to widening life expectancy inequalities in england from 2001 to 2016: a population-based analysis of vital registration data
contribution of six risk factors to achieving the 25x25 non-communicable disease mortality reduction target: a modelling study
national and regional seasonal dynamics of all-cause and cause-specific mortality in the usa from
seasonality of deaths in the u.s. by age and cause
excess winter deaths in europe: a multicountry descriptive analysis
deaths in winter: can britain learn from europe?
forecasting: principles and practice
applied bayesian modelling
high ambient temperature and mortality: a review of epidemiologic studies from
relation between elevated ambient temperature and mortality: a review of the epidemiologic evidence
mortality risk attributable to high and low ambient temperature: a multicountry observational study
impact of ambient temperature on morbidity and mortality: an overview of reviews
vulnerability to the mortality effects of warm temperature in the districts of england and wales
anomalously warm temperatures are associated with increased injury deaths
bayesian model averaging: a tutorial
approximate bayesian inference for latent gaussian models using integrated nested laplace approximations (with discussion)
two thirds of coronavirus victims may have died this year anyway, government adviser says. the telegraph
office for national statistics. comparison of weekly death occurrences in england and wales: up to week ending 8
coronavirus death in california came weeks before first known u.s. death. the new york times
coronavirus involved in quarter of care home residents' deaths in england and wales. the guardian
the health foundation. care homes have seen the biggest increase in deaths since the start of the outbreak
london woman dies of suspected covid-19 after being told she was 'not priority'. the guardian
the uber driver evicted from home and left to die of coronavirus
coronavirus crisis could lead to 18,000 more cancer deaths, experts warn. the guardian
more than 2m operations cancelled as nhs fights covid-19. the guardian
how coronavirus is impacting cancer services in the uk
record drop in a&e attendance in england 'a ticking timebomb', say doctors. the guardian
patients with heart attacks, strokes and even appendicitis vanish from hospitals. the washington post
covid's other casualties. reuters investigates
response of cardiac surgery units to covid-19: an internationally-based quantitative survey

key: cord-223212-5j5r6dd5 authors: hult, henrik; favero, martina title: estimates of the proportion of sars-cov-2 infected individuals in sweden date: 2020-05-25 journal: nan doi: nan sha: doc_id: 223212 cord_uid: 5j5r6dd5
in this paper a bayesian seir model is studied to estimate the proportion of the population infected with sars-cov-2, the virus responsible for covid-19. to capture heterogeneity in the population and the effect of interventions to reduce the rate of epidemic spread, the model uses a time-varying contact rate, whose logarithm has a gaussian process prior. a poisson point process is used to model the occurrence of deaths due to covid-19 and the model is calibrated using data of daily death counts in combination with a snapshot of the proportion of individuals with an active infection, performed in stockholm in late march. the methodology is applied to regions in sweden. the results show that the estimated proportion of the population that has been infected is around 13.5% in stockholm, by 2020-05-15, and ranges between 2.5% and 15.6% in the other investigated regions. in stockholm, where the peak of daily death counts is likely behind us, parameter uncertainty does not heavily influence the expected daily number of deaths, nor the expected cumulative number of deaths. it does, however, impact the estimated cumulative number of infected individuals. in the other regions, where random sampling of the number of active infections is not available, parameter sharing is used to improve estimates, but the parameter uncertainty remains substantial. to understand the spread of the novel coronavirus, sars-cov-2, at an aggregate level it is possible to model the dynamic evolution of the epidemic using standard epidemic models.
such models include the (stochastic) reed-frost model and more general markov chain models, or the corresponding (deterministic) law of large numbers limits such as the general epidemic model, see [3]. there is an extensive literature on extensions of the standard epidemic models incorporating various degrees of heterogeneity in the population, e.g. age groups, demographic information, spatial dependence, etc. these additional characteristics make the models more realistic. for instance, it is possible to evaluate the effect of various intervention strategies. more complex models also involve additional parameters that need to be estimated, contributing to a higher degree of parameter uncertainty. a problem when calibrating even the standard epidemic models to covid-19 data is that there are few reliable sources on the number of infected individuals. publicly available sources provide data on the number of positive tests, the number of hospitalizations, the number of icu admissions and the number of deaths due to covid-19. in some cases, small random samples testing for active infection may be available. for example, the swedish folkhälsomyndigheten performed such a test in stockholm with about 700 subjects in early april 2020. moreover, there is still no consensus in the literature on the value of important parameters such as the basic reproduction number r_0 and the infection fatality rate. a useful approach to incorporate the parameter uncertainty in the models is to consider a bayesian framework. in the bayesian approach parameter uncertainty is quantified by prior distributions over the unknown parameters. the impact of observed data, in the form of a likelihood, yields, via bayes' theorem, the posterior distribution, which quantifies the effects of parameter uncertainty. the posterior can be used to construct estimates of the number of infected individuals, predictions of the future occurrence of infections and deaths, as well as uncertainties in such estimates.
in this paper an seir epidemic model with time-varying contact rate will be used to model the evolution of the number of susceptible (s), exposed (e), infected (i), and recovered (r) individuals. a time-varying contact rate is used to capture heterogeneity in the population, which causes the rate of the spread of the epidemic to vary as the virus spreads through the population. moreover, the time-varying contact rate allows modeling the effect of interventions aimed at reducing the rate of epidemic spread. a poisson point process is introduced to model the occurrence and time of deaths. random samples of tests for active infections are treated as binomial trials where the success probability is the proportion of the population in the infectious state. the methods are illustrated on regional data of daily covid-19 deaths in sweden. it is demonstrated that, by combining the information in the observed number of deaths and random samples of active infections, fairly precise estimates of the number of infected individuals can be given. by assuming that some parameters are identical in several regions, estimates for regions outside stockholm can also be provided, albeit with greater uncertainty. our approach is inspired by [4], where the authors consider a bayesian approach to model an influenza outbreak. the main extensions include the introduction of the poisson point process to model the occurrence of deaths, the addition of random sampling to test for infection, and an extension to multiple regions. to evaluate the posterior distribution we employ markov chain monte carlo (mcmc) sampling. samples from the posterior are obtained using the hamiltonian monte carlo algorithm nuts [9], as implemented in the open source software stan.
to model the spread of the epidemic we consider the deterministic seir model [1, 5], which is a simple model describing the evolution of the number of susceptible, exposed, infected, and recovered individuals in a large homogeneous population with $N$ individuals. the epidemic is modeled by $\{(S_t, E_t, I_t), t \geq 0\}$, where $S_t$, $E_t$ and $I_t$ represent the number of susceptible, exposed and infected individuals at time $t$, respectively. the total number of recovered and deceased individuals at time $t \geq 0$ is always given by $N - S_t - E_t - I_t$. the epidemic starts from a state $S_0, E_0, I_0$ with $S_0 + E_0 + I_0 = N$, and proceeds by updating the numbers $S_t$, $E_t$ and $I_t$ over time. the parameters are the contact rate $\beta > 0$, the rate $\nu$ of transition from the exposed to the infected state and the recovery rate $\gamma > 0$. note that $I_t$ represents the number of individuals with an active infection at time $t$, whereas $N - S_t$ is the cumulative number of individuals who have been exposed, and possibly infected, recovered or deceased, up until time $t$. in the context of the covid-19 epidemic the contact rate cannot be assumed to be constant, primarily due to interventions implemented in the early stage of the epidemic. moreover, as the seir model describes the evolution at an aggregate level, a time-varying contact rate may be used to capture inhomogeneities in the population. if, for example, the epidemic is initiated in a rural area the contact rate may be rather low, but as the epidemic reaches major cities the contact rate will be higher. the resulting seir model with time-varying contact rate $\{\beta_t\}$ is given by (1). clearly, one needs to put some restriction on the amount of variation of the contact rate. in this paper a gaussian process prior will be used on the log contact rate, which restricts the amount of variation in time, but is sufficiently flexible to capture the reduction in contact rate after the interventions.
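as an illustration of the seir recursion with time-varying contact rate described above, the following python sketch iterates a discrete-time version of the model. the update rule (daily time step, new exposures `beta[t] * S * I / N`) is an assumed standard form and is not taken from the text; the function name and all numerical values in the usage lines are hypothetical.

```python
def simulate_seir(beta, nu, gamma, s0, e0, i0):
    """Iterate an assumed discrete-time SEIR recursion, one step per day.

    beta  : sequence of daily contact rates (one value per simulated day)
    nu    : rate of transition from the exposed to the infected state
    gamma : recovery rate
    Returns the trajectories (S, E, I) as lists of length len(beta) + 1.
    """
    n = s0 + e0 + i0  # total population, S_0 + E_0 + I_0 = N
    s, e, i = [s0], [e0], [i0]
    for b in beta:
        # Flows are computed from the current state before any update.
        new_exposed = b * s[-1] * i[-1] / n
        new_infected = nu * e[-1]
        new_removed = gamma * i[-1]
        s.append(s[-1] - new_exposed)
        e.append(e[-1] + new_exposed - new_infected)
        i.append(i[-1] + new_infected - new_removed)
    return s, e, i


# Hypothetical usage: the contact rate drops after day 15 (e.g. an intervention).
beta = [0.5] * 15 + [0.1] * 45
S, E, I = simulate_seir(beta, nu=1 / 5, gamma=1 / 14, s0=999_990.0, e0=0.0, i0=10.0)
cumulative_exposed = [S[0] + E[0] + I[0] - s for s in S]  # N - S_t
print(round(cumulative_exposed[-1]))
```

in the paper the contact rate is not fixed in advance as in this usage example but receives a gaussian process prior on its logarithm; the sketch only shows how a given path of contact rates is turned into trajectories.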
when observations on the number of infected and recovered individuals are available, the model (1) can be fitted to these observations. in the context of covid-19, observations on the number of infected and recovered individuals are unavailable. there are many symptomatic individuals who are not tested and potentially a large pool of asymptomatic individuals. in this paper we will rely on the number of registered deaths due to covid-19 to calibrate the model. in addition we will incorporate the test results from a random sample that provides a snapshot of the number of individuals with an active infection. to model the occurrence of deaths due to covid-19 we consider the following poisson point process representation. we refer to [10] for details on poisson point processes. let $f$ denote the infection fatality rate, that is, the probability that an infected individual eventually dies from the infection. consider the number of individuals that enter the infected state on day $t$, that is, $\nu E_t$. each such infected individual has probability $f$ of eventually dying from the infection. conditional on death due to the infection, the time from infection until death is assumed independent of everything else and follows a probability distribution with probability mass function $p_{s_d}$. each individual that dies may be represented as a point $(t, \tau)$ in $\mathcal{E} := \{(t, \tau) \in \mathbb{N}^2 : \tau \geq t\}$, where $t$ denotes the time of entry to the infected state and $\tau$ the time of death of the individual. the number of deaths at time $\tau$ can then be computed by counting the number of points on the line $\cup_{t=0}^{\tau} \{(t, \tau)\}$. the number of deaths, and the corresponding times of infection and death, are conveniently modelled by a poisson point process on $\mathcal{E}$. let $\xi$ be a poisson point process on $\mathcal{E}$ with intensity $f \nu E_t \, p_{s_d}(\tau - t)$ at the point $(t, \tau)$. we may interpret a point at $(t, \tau)$ of the poisson point process as the time of infection, $t$, and the time of death, $\tau$, of an individual who dies from the infection.
the number of deaths $D_\tau$ that occur at time $\tau$ is then given by summing up all the points of the point process on the row corresponding to $\tau$, $D_\tau = \xi(\cup_{t=0}^{\tau} \{(t, \tau)\})$. since the rows are disjoint this implies that $D_0, D_1, \dots$ are independent with each $D_\tau$ having a poisson distribution with parameter $\lambda_\tau = \sum_{t=0}^{\tau} f \nu E_t \, p_{s_d}(\tau - t)$. throughout this paper $p_{s_d}$ is the probability mass function of a negative binomial distribution with mean $s_d$. more precisely, a parametrization of the negative binomial distribution with parameters $r$, $s_d$ will be used, where the value $r = 3$ will be used throughout as this fits well with the distribution of observed duration from symptoms to death in the study by [15].

in this section we provide the assumptions on the prior distributions and derive the expression of the likelihood of the model. note that $\lambda_\tau$ is a function of all the parameters of the model, $\theta = (\{\beta_t\}, \nu, \gamma, S_0, f, s_d)$. the parameters, their interpretation and prior distribution are summarized in table 1. since the contact rate is positive, a gaussian process (gp) prior will be used for the natural logarithm of the contact rate, denoted log-gp in the sequel. the gaussian process has a constant mean $\mu$ and a squared-exponential covariance kernel $k$ with parameters $\alpha$, $\rho$, $\delta$.

table 1. specification of the parameters and prior distributions.

to compute the likelihood the observed number of daily deaths, $d_0, d_1, \dots, d_T$, will be used, in combination with a random sample of $n_0$ tests for active infection, performed at a time $t_0$, when such a test result is available. the number of individuals $Z$ with positive test result has a $\mathrm{Bin}(n_0, I_{t_0}/N)$ distribution.
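the poisson parameter for the daily deaths can be sketched in python as a discrete convolution of the daily inflow to the infected state with the delay distribution. this is a minimal sketch, assuming the negative binomial is parametrized by a shape $r$ and mean $s_d$ through the success probability $p = r/(r + s_d)$; the function names and the numbers in the usage lines are hypothetical.

```python
import math


def neg_binomial_pmf(k, r, mean):
    """P(K = k) for a negative binomial with shape r and the given mean (k = 0, 1, ...)."""
    p = r / (r + mean)  # assumed parametrization; gives mean r * (1 - p) / p = mean
    log_pmf = (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
               + r * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_pmf)


def expected_daily_deaths(exposed, nu, f, s_d, r=3):
    """Poisson parameter for each day tau: sum over t <= tau of f * nu * E_t * p(tau - t)."""
    lam = []
    for tau in range(len(exposed)):
        lam.append(sum(f * nu * exposed[t] * neg_binomial_pmf(tau - t, r, s_d)
                       for t in range(tau + 1)))
    return lam


# Hypothetical usage: a short exposed trajectory E_0, ..., E_4.
lam = expected_daily_deaths([100.0, 150.0, 220.0, 300.0, 380.0], nu=0.2, f=0.01, s_d=16.0)
print([round(x, 4) for x in lam])
```

the double loop costs a number of operations quadratic in the number of days, which is unproblematic for a few months of daily data.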
the full likelihood is given by: $L(\theta; d, z) = \left[\prod_{\tau=0}^{T} e^{-\lambda_\tau} \frac{\lambda_\tau^{d_\tau}}{d_\tau!}\right] \binom{n_0}{z} \left(\frac{I_{t_0}}{N}\right)^{z} \left(1 - \frac{I_{t_0}}{N}\right)^{n_0 - z}$ (3). the joint prior is the product of the marginal priors and leads, by bayes' theorem, to the posterior, $p_{\theta \mid d, z}(\theta) \propto L(\theta; d, z)\, p_\theta(\theta)$. the expected number of daily deaths $\lambda_\tau$, the cumulative number of deaths and the cumulative number of infected individuals $N - S_t$ are all functions of $\theta$ and their distribution can therefore be inferred from the posterior $p_{\theta \mid d, z}$. by sampling from the posterior and iterating the dynamics (1), estimates of these quantities may be obtained along with the effects of parameter uncertainty. moreover, predictions on the future development of the above mentioned quantities can be obtained by extrapolating the contact rate into the future. as the posterior distribution is unavailable in explicit form it is necessary to employ monte carlo methods. in the next section markov chain monte carlo methods are briefly described to sample from the posterior.

4.1. multiple regions. the seir model (1) and the derivation of the likelihood (3) consider a single region. in the context of multiple regions it may be reasonable to assume that some parameters are identical. for example, when considering multiple regions of sweden below it will be assumed that the rate, $\nu$, from exposed to infected, the recovery rate, $\gamma$, the infection fatality rate, $f$, and the duration, $s_d$, are identical in all regions. it is tempting to include interaction terms between the regions as infected individuals from one region may travel to another region and cause new infections. in this paper, it will be assumed that each region has its own time-varying contact rate that incorporates fluctuations in new infections due to imported cases from other regions. the likelihood from multiple regions is simply the product of the marginal likelihoods for the individual regions and the prior is the product of the marginal priors for each parameter.
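the structure of the likelihood described above (independent poisson counts for the daily deaths, an optional binomial term for a random sample of tests, and a product over regions) can be sketched on the log scale as follows. this is an illustrative sketch and not the stan implementation used in the paper; all function names and the numbers in the usage lines are hypothetical, apart from the sample of 18 positives among 707 tests mentioned in the text.

```python
import math


def poisson_log_pmf(d, lam):
    """log P(D = d) for D ~ Poisson(lam); requires lam > 0."""
    return d * math.log(lam) - lam - math.lgamma(d + 1)


def binomial_log_pmf(z, n0, q):
    """log P(Z = z) for Z ~ Bin(n0, q); requires 0 < q < 1."""
    return (math.lgamma(n0 + 1) - math.lgamma(z + 1) - math.lgamma(n0 - z + 1)
            + z * math.log(q) + (n0 - z) * math.log(1.0 - q))


def region_log_likelihood(deaths, lam, z=None, n0=None, infected_fraction=None):
    """Poisson terms for the daily deaths plus, if a test sample exists, a binomial term."""
    ll = sum(poisson_log_pmf(d, l) for d, l in zip(deaths, lam))
    if z is not None:
        ll += binomial_log_pmf(z, n0, infected_fraction)
    return ll


def multi_region_log_likelihood(regions):
    """A product of regional likelihoods is a sum of regional log-likelihoods."""
    return sum(region_log_likelihood(**region) for region in regions)


# Hypothetical usage: one region with a random test sample, one without.
regions = [
    {"deaths": [1, 0, 3], "lam": [0.8, 1.1, 2.5], "z": 18, "n0": 707,
     "infected_fraction": 0.025},
    {"deaths": [0, 1, 1], "lam": [0.3, 0.6, 0.9]},
]
print(multi_region_log_likelihood(regions))
```

in a full model `lam` and `infected_fraction` would themselves be computed from the parameters through the seir dynamics, so that the log-likelihood becomes a function of $\theta$.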
thus, for two regions the prior will be the product of two gaussian process priors for the respective log contact rates for the two regions and the product of the marginal priors for the remaining parameters.

markov chain monte carlo (mcmc) methods in bayesian analysis aim at sampling from the posterior distribution. this is non-trivial because the marginal distribution of the data, which acts as the normalizing constant of the posterior, is practically impossible to compute. in mcmc algorithms the posterior is represented as a target distribution. the algorithms rely on the construction of a markov chain whose invariant distribution is the target distribution. standard mcmc methods are based on acceptance-rejection steps, where random proposals are accepted or rejected with a probability that does not require knowledge of the normalizing constant, e.g., metropolis-hastings and gibbs sampling [13, 8, 6]. when the target distribution is complex and multi-modal, standard methods may lead to poor mixing of the markov chain and slow convergence to the target distribution. to overcome slow mixing of the markov chain, gradient-based sampling can be applied, which adapts the proposal distribution based on gradients of the target, see e.g. [2]. in this paper we will employ a hamiltonian monte carlo sampler, the no-u-turn sampler (nuts) by [9], in combination with automatic differentiation to evaluate the gradients [7], which is implemented in the open source software stan.

in this section the estimates of the number of infected individuals and predictions on the evolution of the number of deaths and number of infected individuals are provided for ten regions of sweden. the epidemic is considered to start on 2020-03-01 and interventions in sweden began on 2020-03-16. the joint prior distribution is the product of the marginal priors, and the hyper-parameters are specified in table 2.
the choices of hyper-parameter values are made in line with existing literature on the covid-19 epidemic. as a general principle we have used informative priors on the parameters $\nu$, $\gamma$, and $s_d$, whereas the priors on the time-varying contact rate $\{\beta_t\}$ and the fatality rate $f$ are uninformative. folkhälsomyndigheten reports that the incubation period is usually around 5 days, which corresponds to $1/\nu \approx 5$. similarly the expected time to recovery is around 14 days, $1/\gamma \approx 14$. the overall infection fatality rate $f$ is estimated to be in the range 0.003-0.013, see [14, 15]. however, since the infection fatality rate is a very important parameter, we have used an uninformative uniform prior, beta(1, 1). the expected duration from symptoms to death is around 16 days, see [15]. samples from the posterior are obtained using the nuts sampler with a burn-in period of 500 samples and 5000 samples after burn-in.

a random test for active infection among 707 individuals in stockholm was performed by folkhälsomyndigheten between 2020-03-27 and 2020-04-03. it showed that 18 individuals carried the sars-cov-2 virus. these results are included in the analysis as a binomial sample of size $n_0 = 707$ and success probability $I_{t_0}/N$, where the test date, $t_0$, is assumed to be 2020-03-30. a summary of the marginal posterior distributions is provided in table 3. the posterior distribution of the time-varying contact rate is illustrated in figure 1. note that although there is great uncertainty about the initial contact rate, the model clearly picks up the reduction in contact rate after the interventions began on 2020-03-16. the contact rate is gradually reduced around the time of intervention and then remains at a low level. this slow reduction of the contact rate is, however, not due to stiffness of the gaussian process kernel. we have experimented with a sharp break-point in the contact rate at the time of intervention, but it did not provide more accurate results.
on the contrary, the data suggest that the reduction of the contact rate is slow. the contact rate is estimated only until 2020-05-01. after this date the posterior is unreliable, because many of the deaths of individuals who were infected after 2020-05-01 have not yet been observed. to perform estimates and predictions of the future number of daily and cumulative infections and deaths, the contact rate has been extrapolated from its value on 2020-05-01 by assuming that it will remain constant. the posterior distribution suggests that the contact rate has been constant, at a low level, since roughly 2020-04-07, which motivates this extrapolation, assuming that the interventions remain at the present level. figure 2 (top left) shows the observed daily number of deaths (black dots) along with the posterior median (dark red) and 95% credibility interval (red) for the expected number of daily deaths. figure 2 (top right) shows the observed cumulative number of deaths (black dots) along with the posterior median (dark red) and 95% credibility interval (red) for the expected cumulative number of deaths. we observe that the parameter uncertainty does not substantially impact the expected number of daily deaths, and the peak of the daily number of deaths appears to have occurred by mid april. similarly, the expected cumulative number of deaths in stockholm is likely to terminate slightly above 2000. we emphasize that this is the expected number of deaths, $\lambda_\tau$. since we are considering a poisson distribution for the number of daily deaths, an approximate 95% prediction interval would be $\lambda_\tau \pm 2\sqrt{\lambda_\tau}$, where $\lambda_\tau$ is the poisson parameter on day $\tau$. note from the observed number of daily deaths that the empirical distribution of daily deaths appears to be overdispersed: the variance is substantially larger than the mean.
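the approximate prediction interval and the notion of overdispersion mentioned above can be made concrete with a few lines of python. this is a minimal sketch; the function names are hypothetical, truncating the lower bound at zero is an added convenience, and the variance-to-mean ratio is only a crude check because the expected number of deaths changes from day to day.

```python
import math


def poisson_prediction_interval(lam):
    """Approximate 95% prediction interval lam +/- 2 * sqrt(lam) for a Poisson count."""
    half_width = 2.0 * math.sqrt(lam)
    # Counts cannot be negative, so the lower bound is truncated at zero (an added choice).
    return max(0.0, lam - half_width), lam + half_width


def dispersion_index(counts):
    """Sample variance divided by sample mean; close to 1 for Poisson data with constant mean."""
    mean = sum(counts) / len(counts)
    variance = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
    return variance / mean


# With an expected number of 25 deaths on a given day the interval is (15, 35).
print(poisson_prediction_interval(25.0))
```

a dispersion index well above 1 over a window in which the expected number of deaths is roughly constant would indicate overdispersion relative to the poisson model.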
this is likely due to how the data are reported. the data presented at https://c19.se/ do not retrospectively correct the reported dates of death. a comparison at the national level with data provided by folkhälsomyndigheten shows that the official records of the daily number of deaths for sweden do not appear to be overdispersed. nevertheless, even after smoothing the data from https://c19.se/ by a moving average over a few days, the results of the simulations remain essentially the same. figure 2 (bottom left/right) shows the posterior median (dark red) and 95% credibility interval (red) for the daily/cumulative number of infected individuals. although the parameter uncertainty has significant impact on the cumulative number of infected individuals, some conclusions are still possible. as of mid may, the cumulative number of infected individuals has almost reached its terminal value and the spread of the epidemic has slowed down significantly. the estimated cumulative number of infected individuals is 13.5% of the population in stockholm. the estimated cumulative number of infected individuals by 2020-04-11 is 10.7% of the population, showing that these results are well in line with the reports of the antibody test performed at kth, which indicated that 10% of the population in stockholm had developed antibodies against the sars-cov-2 virus by the first weeks of april. we emphasize that the estimate of the cumulative number of infected individuals in stockholm relies heavily on the inclusion of results from the random sampling performed by folkhälsomyndigheten in late march and early april. without this crucial piece of information, models similar to the one analyzed here may provide a significantly higher estimate of the cumulative number of infected individuals.

table 3. marginal posterior median and credibility intervals for region stockholm.

6.2. summary of the results for ten regions of sweden.
in this section estimates of the cumulative number of infected individuals are provided for the following regions of sweden:
(1) stockholm (population: 2.34 · 10^6)
(2) västra götaland (population: 1.71 · 10^6)
(3) östergötland (population: 1.36 · 10^6)
(4) örebro (population: 3.02 · 10^5)
(5) skåne (population: 1.36 · 10^6)
(6) jönköping (population: 3.56 · 10^5)
(7) sörmland (population: 2.95 · 10^5)
(8) västmanland (population: 2.74 · 10^5)
(9) uppsala (population: 3.76 · 10^5)
(10) dalarna (population: 2.87 · 10^5)

the daily death counts for the regions of sweden until 2020-05-15 are obtained from the webpage: https://c19.se/. there is no random testing providing information on the proportion of infected individuals outside region stockholm. to estimate the contact rate and the cumulative number of infected individuals in regions outside stockholm, we have implemented the multi-region model pairwise, with two regions in each mcmc simulation, where one region is stockholm and the other region is from the list above. it is assumed that the parameters $\nu$, $\gamma$, $f$, and $s_d$ are identical in both regions, but the time-varying contact rate and the initial proportion of susceptible individuals are different between the regions. the posterior of the contact rates for the different regions are provided in figure table 4 along with 95% credibility intervals. overall the proportions are low, far from herd immunity.

top left: observed daily number of deaths (black dots), the posterior median (dark red) and 95% credibility interval for the expected daily number of deaths. top right: observed cumulative number of deaths (black dots), the posterior median (dark red) and 95% credibility interval for the expected cumulative number of deaths. bottom left: the posterior median (dark red) and 95% credibility interval for the daily number of infected individuals.
bottom right: the posterior median (dark red) and 95% credibility interval for the cumulative number of infected individuals.

[1] infectious diseases of humans: dynamics and control
[2] the geometric foundations of hamiltonian monte carlo
[3] basic estimation-prediction techniques for covid-19, and a prediction for stockholm
[4] contemporary statistical inference for infectious disease models using stan
[5] mathematical tools for understanding infectious disease dynamics
[6] stochastic relaxation, gibbs distributions, and the bayesian restoration of images
[7] evaluating derivatives: principles and techniques of algorithmic differentiation
[8] monte carlo sampling methods using markov chains and their applications
[9] the no-u-turn sampler: adaptively setting path lengths in hamiltonian monte carlo
[10] random measures
[11] early transmission dynamics in wuhan, china, of novel coronavirus infected pneumonia
[12] substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov2)
[13] equation of state calculations by fast computing machines
[14] estimates of the severity of coronavirus disease 2019: a model-based analysis. the lancet infectious diseases
[15] estimating clinical severity of covid-19 from the transmission dynamics in wuhan, china

key: cord-000757-bz66g9a0 authors: davis, kailah; staes, catherine; duncan, jeff; igo, sean; facelli, julio c title: identification of pneumonia and influenza deaths using the death certificate pipeline date: 2012-05-08 journal: bmc med inform decis mak doi: 10.1186/1472-6947-12-37 sha: doc_id: 757 cord_uid: bz66g9a0 background: death records are a rich source of data, which can be used to assist with public surveillance and/or decision support. however, to use this type of data for such purposes it has to be transformed into a coded format to make it computable. because the cause of death in the certificates is reported as free text, encoding the data is currently the single largest barrier to using death certificates for surveillance.
therefore, the purpose of this study was to demonstrate the feasibility of using a pipeline, composed of a detection rule and a natural language processor, for the real time encoding of death certificates, using the identification of pneumonia and influenza cases as an example and demonstrating that its accuracy is comparable to existing methods. results: a death certificates pipeline (dcp) was developed to automatically code death certificates and identify pneumonia and influenza cases. the pipeline used metamap to code death certificates from the utah department of health for the year 2008. the output of metamap was then accessed by detection rules which flagged pneumonia and influenza cases based on the centers for disease control and prevention (cdc) case definition. the output from the dcp was compared with the current method used by the cdc and with a keyword search. recall, precision, positive predictive value and f-measure with respect to the cdc method were calculated for the two other methods considered here. the two different techniques compared here with the cdc method showed the following recall/precision results: dcp: 0.998/0.98 and keyword searching: 0.96/0.96. the f-measures were 0.99 and 0.96 respectively (dcp and keyword searching). both the keyword search and the dcp can run in interactive form with modest computer resources, but dcp showed superior performance. conclusion: the pipeline proposed here for coding death certificates and the detection of cases is feasible and can be extended to other conditions. this method provides an alternative that allows for coding free-text death certificates in real time that may increase its utilization not only in the public health domain but also for biomedical researchers and developers. trial registration: this study did not involve any clinical trials.
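the evaluation measures named in the abstract can be written out in a few lines of python. this is a minimal sketch with hypothetical function names; it uses the standard definitions, under which precision and positive predictive value are the same quantity, and the f-measure is the harmonic mean of precision and recall.

```python
def recall(tp, fn):
    """Fraction of reference-standard cases that the method flags."""
    return tp / (tp + fn)


def precision(tp, fp):
    """Fraction of flagged records that are reference-standard cases (equals the PPV)."""
    return tp / (tp + fp)


def f_measure(p, r):
    """Harmonic mean of precision p and recall r."""
    return 2.0 * p * r / (p + r)


# The recall/precision pairs reported above reproduce the reported f-measures.
print(round(f_measure(0.98, 0.998), 2))  # dcp: 0.99
print(round(f_measure(0.96, 0.96), 2))   # keyword searching: 0.96
```

because the harmonic mean is dominated by the smaller of the two values, a method cannot reach a high f-measure by trading away either recall or precision.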
the ongoing monitoring of mortality is crucial to detect and estimate the magnitude of deaths during epidemics, emergence of new diseases (for example, seasonal or pandemic influenza, aids, sars), and the impact of extreme environmental conditions on a population such as heat waves or other relevant public health events or threats [1, 2]. the surveillance of vital statistics is not a novel idea; mortality surveillance has played an integral part in public health since the london bills of mortality were devised in the seventeenth century [3]. the bills served as an early warning tool against bubonic plague by monitoring deaths from 1635 to the 1830s. today, mortality surveillance continues to be a critical activity for public health agencies throughout the world [4] [5] [6] [7]. pneumonia and influenza are serious public health threats and are a cause of substantial morbidity and mortality worldwide; for instance, the world health organization (who) estimates that seasonal influenza causes between 250,000 and 500,000 deaths worldwide each year [8], while pneumonia kills more than 4 million people worldwide every year [9]. worldwide, the morbidity and mortality of influenza and pneumonia have a considerable economic impact in the form of hospital and other health care costs. each year in the united states approximately 3 million persons acquire pneumonia and, depending on the severity of the influenza season, 15 to 61 million people in the us contract influenza [9]. these numbers contribute to approximately 1.3 million hospitalizations, of which 1.1 million are pneumonia cases [10] and the remainder are influenza cases [11]. moreover, pneumonia and influenza together cost the american economy 40.2 billion dollars in 2005 [12]. in the netherlands it has been estimated that influenza accounts for 3713 and 744 days of hospitalization per 100,000 high-risk and low-risk elderly, respectively [13].
due to the public health burden and the unpredictability of an influenza season, strong pneumonia and influenza surveillance systems are a priority for health authorities. mortality monitoring is an important tool for the surveillance of pneumonia and influenza which can aid in the rapid detection and estimation of excess deaths and inform and evaluate the effect of vaccination and control programs. traditionally, influenza mortality surveillance often uses the category of "pneumonia and influenza" (p-i) on death certificates as an indicator of the severity of an influenza season or to identify trends within a season; however, only a small proportion of these deaths are influenza related. it has been reported that only 8.5-9.8% of all pneumonia and influenza deaths are influenza related [14, 15]. the non-influenza-related pneumonia deaths tend to be stable from year to year and fluctuations in this category are largely driven by the prevalence and severity of seasonal influenza. as a result, the p-i category is an important sentinel indicator. in the us, death certificates are the primary data source for mortality surveillance whose findings are widely used to exemplify epidemics and measure the severity of influenza seasons [16]. currently, there are three systems to monitor influenza-related mortality; one system in particular, the 122 cities mortality reporting system, provides a rapid assessment of pneumonia and influenza mortality [6]. each week, this system summarizes the total number of death certificates filed in 122 us cities, as well as the number of deaths due to pneumonia and influenza. however, even these data can be delayed by approximately 2-3 weeks from the time of death. this delay can be attributed to one of the following reasons: 1) timeliness of death registration and 2) reviewing of the death certificates to identify pneumonia and influenza deaths [6, 16, 17].
the registration and reviewing of death certificates vary by state and, as a result, there is variability in the length of time to report a death to cdc. for instance, states with paper-based death registration systems typically perform manual reviews of the death certificates, which can take up to 3 weeks; however, states with electronic death registration systems (edrs) may perform automatic reviews, which can decrease this time significantly. the current 122 cities mortality reporting system also lacks flexibility for expanding the number of conditions and/or the geographic distribution. moreover, the unavailability of coded death records due to the complexity of the national center for health statistics (nchs) coding process results in multiple strategies to identify common outbreaks such as pneumonia and influenza deaths, which greatly vary by jurisdiction. to bypass the lengthy nchs process, a variety of approaches have been attempted that are close to 'real time' but less than optimal. for instance, in utah keyword searching is used to identify pneumonia and influenza deaths; although this method is fast and easy to implement, it can easily result in the over- or underestimation of cases. this can occur by missing cases due to misspelled terms, synonyms, variations, or the selection of strings containing the search term. other research groups [18, 19] have demonstrated the feasibility of using mortality data for real time surveillance but all used "free text" search for the string "pneumonia", "flu" or "influenza." as noted earlier, although this method can provide semi-quantitative measurements for disease surveillance purposes, keyword searches can also result in an array of problems that result from complexities of human language such as causal relationships and synonyms [20]. therefore, the lack of coded death data, which may not be available for months [21], seriously limits the use of death records in automated systems.
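the two failure modes of keyword searching described above can be reproduced with a small python sketch. the cause-of-death strings are invented for illustration and the function names are hypothetical; the search terms are the three strings quoted in the text.

```python
import re

KEYWORDS = ("pneumonia", "flu", "influenza")


def substring_match(text):
    """Plain substring search: flags any string that contains a keyword."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


# Whole-word variant: the keyword must be delimited by word boundaries.
WORD_PATTERN = re.compile(r"\b(?:pneumonia|flu|influenza)\b", re.IGNORECASE)


def whole_word_match(text):
    return WORD_PATTERN.search(text) is not None


# Invented example strings, not taken from real death certificates.
examples = [
    "bilateral pneumonia",                  # found by both searches
    "pleural effusion with fluid overload", # "flu" inside "fluid": false positive for substrings
    "pnuemonia, respiratory failure",       # misspelling: missed by both searches
]
for text in examples:
    print(text, substring_match(text), whole_word_match(text))
```

whole-word matching removes the false positive from embedded strings but does nothing for misspellings or synonyms, which is one motivation for mapping the free text to coded concepts instead.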
at this time, there is little published on the automatic assignment of codes to death certificates for automatic case detection. currently the coding of death certificates is a complex process which involves many entities. in the us, where we are focusing this study, the codes on death certificates that are generated by the national center for health statistics (nchs) depend on information reported on the death certificate by the medical examiner, coroner, or another certifier, and there is substantial variation in how certifiers interpret and adhere to cause-of-death definitions [22]. the cause of death literals are coded into the international classification of diseases tenth revision (icd-10) [23] and the underlying and multiple-cause-of-death codes are selected based on the world health organization coding rules. these coding rules have been automated by cdc with the development of the mortality medical data system (mmds), which consists of four programs: super mortality medical indexing classification and retrieval (supermicar) data entry; mortality medical indexing classification and retrieval (micar); automated classification of medical entities (acme) and transax (translation axes). supermicar was designed to facilitate the entry of literal text of causes of death in death certificates and convert them into standardized expressions acceptable by micar. it contains a dictionary which assigns an entity reference number (ern) to statements on the death certificate. these erns are fed into micar200, which transforms the erns into icd-10 codes by using specific mortality coding rules; the rules require look-up files and a dictionary. acme and transax then select the underlying and multiple causes of death respectively. icd-10 codes from micar200 are fed into acme, which assigns the underlying cause of death using decision tables. the decision table contains all possible pairings of diseases for which the first disease can cause the second.
in the latest version of the system, acme is comprised of eight decision tables including three tables of valid and invalid codes, causal relationships (general principle and rule 1), and direct sequel (rule 3), and three other tables needed by modification rules. figure 1 provides the workflow for the mmds system. of the 2.3 million deaths that occur each year, 80-85 percent are automatically coded through supermicar, and the remaining records are then manually coded by nosologists (medical classification specialists) [24]; this is a tedious and lengthy process lasting up to 3 months. although the automation process has decreased the time required for coding death data to 1-2 weeks, the national vital statistics data are not available for at least two years. therefore, local health departments still manually code records or perform basic processing techniques to quickly characterize disease patterns [25]. records that were processed through supermicar or were manually coded are then processed through the remaining components (micar200, acme and transax) of mmds. in 1999, micar200 had a throughput rate of 95-97%, while the acme rate was 98 percent. moreover, based on a reliability study, the acme error rate for selecting the underlying cause is one-half percent, and transax, which selects the multiple cause codes, also had a one-half percent error rate [26]. due to the high processing rates and low error rates, mmds is considered by practitioners as the gold standard for the processing and coding of death certificates in the us and other countries (such as canada, the united kingdom (uk) and australia). therefore, we used the codes produced by this system as the "gold standard" when comparing with the methods developed here.
in 1997, the us steering committee to reengineer the death registration process (a task force representing federal agencies, the national center for health statistics and the social security administration, and professional organizations representing funeral directors, physicians, medical examiners, coroners, hospitals, medical records professionals, and vital records and statistics officials (naphsis)) published the report "toward an electronic death registration system in the united states: report of the steering committee to reengineer the death registration process." this report explained the feasibility of developing electronic death registration in the united states [27] and argued that these electronic death records have the potential to be an effective source of information for nation-wide tracking and detecting of disease outbreaks. however, little action has been taken to implement such recommendations in a comprehensive manner. as of july 2011, electronic death registration systems were operating in 36 states and the district of columbia, and were in the development or planning stage in a dozen others [28]. information representing the 'cause of death' field on the death certificates is free text. one major goal of natural language processing (nlp) is to extract and encode data from free text. there have been many research groups developing nlp systems to aid in clinical research, decision support, quality assurance, the automation of encoding free text data and disease surveillance [29] [30] [31]. although there have been a few nlp applications to the public health domain [32, 33], little is known about its capability to automatically code death certificates for outbreak and disease surveillance. recently, medical match master (mmm) [25], developed by riedl et al at the university of california davis, was used to match unstructured cause of death phrases to concepts and semantic types within the unified medical language system (umls).
the system annotates each death phrase input with two types of information, the concept unique identifier (cui) and a semantic type, both assigned by the umls. mmm was able to identify an exact concept identifier (cui) from the umls for over 50% of 'cause of death' phrases. although the focus of this study was to use nlp techniques to process death certificates, the description of this system reported in the literature did not show how well coded data from an nlp tool along with predefined rules can detect countable cases for a specific disease or condition. the purpose of our project is to create a pipeline which automatically encodes death certificates using an nlp tool and identifies deaths related to pneumonia and influenza, providing daily and/or weekly counts. we compared the new technique developed here with keyword searching and mmds as exemplars of the easiest possible approach and the current "gold standard", respectively. the comparison of the techniques was done by calculating recall, precision, f-measure, positive predictive value and agreement (cohen's kappa). we obtained 14,440 de-identified electronic death records, all with multiple-cause-of-death, from the utah department of health (udoh) for the period 1 january 2008 to 31 december 2008. the records included a section describing the disease or condition directly leading to death, and any antecedent causes, co-morbid conditions and other significant contributing conditions. examples of a paper and an electronic death certificate are shown in figures 2 and 3, respectively. all death certificates used in this study have been processed using the mortality medical data system (mmds) and the record axis codes were received from udoh. for our study we randomly selected 6,450 (45%) records. all death records included in the study were previously also coded by nchs into icd-10, but this information was not used for our coding; it was used only a posteriori to assess the quality of the automatic coding.
we chose to apply the centers for disease control and prevention case definition of pneumonia and influenza deaths, defined by cdc's epidemiology staff through personal communication. therefore, the operational definition for deaths from influenza includes deaths from all types of influenza with the exception of deaths from haemophilus influenzae infection and deaths from parainfluenzae virus infection. pneumonia deaths include deaths from all types of pneumonia including pneumonia due to h. influenzae and pneumonia due to parainfluenzae virus. the exceptions include aspiration pneumonia (o74.0, o29, o89.0, j69.- and p24.-), pneumonitis (j84.1, j67-j70), and pneumonia due to pneumococcal meningitis (j13, g00.1). pneumonia and influenza related deaths were defined as deaths with one of the diagnoses listed in table 1 reported in any cause of death field. these codes were selected through manual review of the icd-10 version 2007 manual [23]. the death certificates pipeline, dcp, was developed to identify pneumonia and influenza cases. the pipeline consisted of two components. the first component of the system was the natural language processor, for which we used metamap [34], and the second component was the definitional rules that were applied to the output generated by metamap. the study procedures for this pipeline included: preprocessing, nlp, extraction of coded data and the detection of pneumonia and influenza cases (figure 4). spelling errors are common on death certificates; therefore, the death records were first processed through a spell checker to identify misspellings. although the umls sl has a spell suggestion tool called gspell [35] [36] [37], we decided not to use it and chose to utilize aspell [38].
our motivation for this decision was based upon an evaluation which showed aspell outperforming gspell; aspell performed better on three areas of performance, which included (2) whether the correct word was ranked in the top ten; and (3) whether the correct word was found at all [35]. perl (www.perl.org), a high-level computer programming language that aids in the manipulation and processing of large volumes of text data, was then used to prepare the cause of death free text for nlp. the preprocessing also involved the removal of non-ascii characters; this was a required technical step for metamap processing.
step 2: natural language processing
metamap was used to convert the electronic death records to coded descriptions appropriate for the rule based system. metamap [34], developed by the national library of medicine (nlm), identifies biomedical concepts in free-form textual input and maps them to concepts from the unified medical language system (umls) metathesaurus [34, 39]. metamap works by breaking the inputted text into words or phrases, mapping them to standard terms, and then matching the terms to concepts in the umls [40]. for each matched phrase, metamap classifies it into a semantic type and then returns the concept unique identifier (cui) and the mapping options, which are ranked according to the strength of the mapping. output from metamap: bolded text in the nlp output represents the code and its corresponding phrase.
step 3: extraction of coded data
the data produced by metamap (xml format) was processed through a perl script to extract the inputted text and its corresponding meta-mapped cuis. this extracted data was outputted to a text document.
step 4: identification of p-i deaths
the identification of pneumonia and influenza cases involved two steps: 1) identifying cuis relating to pneumonia and influenza and 2) using the cuis to create a rules based algorithm to identify cases.
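the preprocessing step described above (removal of non-ascii characters before metamap) can be illustrated with a minimal sketch. the study used perl; the python below is only a stand-in, and the function names `to_ascii` and `prepare_literals` are hypothetical, not taken from the authors' scripts. folding accented letters to their base letter before dropping the remaining non-ascii characters is an assumption of this sketch, not a documented detail of the pipeline.

```python
import re
import unicodedata


def to_ascii(text: str) -> str:
    """Fold accented letters to ASCII and drop any other non-ASCII character."""
    # NFKD splits an accented letter into base letter + combining mark,
    # so encoding with errors="ignore" keeps the base letter.
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Collapse whitespace runs left behind by removed characters.
    return re.sub(r"\s+", " ", ascii_only).strip()


def prepare_literals(fields):
    """Clean each cause-of-death field and skip fields that end up empty."""
    cleaned = (to_ascii(field) for field in fields)
    return [text.lower() for text in cleaned if text]


if __name__ == "__main__":
    print(prepare_literals(["Pneumon\u00eda  due to\u00a0sepsis", "", "COPD"]))
```

spell checking (aspell in the study) would run before this step and is not shown here.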
details of each step are explained in the following paragraphs. to determine which cui codes were relevant for identifying pneumonia and influenza deaths it was necessary to create a "cui code list" that represents all the icd-10 codes of interest (see table 1 ). to create this list, we generated a subset of the umls 2010 ab database [41] using the metamorphosys [40] tool provided by the national library of medicine, nlm. the umls database includes many vocabularies, therefore, to determine which vocabularies are relevant to our aims we used the procedure used by riedl three queries were performed on the subset described above to map pneumonia and influenza icd-10 codes to cuis and identify related pneumonia and influenza concepts. each query was then placed in a separate database, all duplicates were removed and a sub-query was run to ensure that only the icd-10 codes in table 1 were included in this list. this produced 241 distinct concept identifiers (cuis) relating to pneumonia or influenza. these codes were used to develop the rules to identify the cases of interest. the coded data produced by metamap was accessed by rules, aimed at identifying the presence of pneumonia and influenza based on the coded data. the rules for identifying these deaths used the cui code list described above. the rule looks at each cause of death field (underlying cause, additional causes, etc.) to flag records with relevant codes. these rules used boolean operators (and, or, not) and if-then statements to create a chain of rules ( figure 5 ). the list of cases identified by our automated detection system was compared with those identified by two other methods: a) keyword searching and b) the reference standard: the icd-10 codes given by the cdc mmds method. for key-word searching we followed the process to evaluate the performance of both techniques against the reference standard, we needed to specify what constituted a match. 
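the rule chain described above (boolean tests over the cuis found in each cause of death field) can be sketched as follows. this is an illustration under stated assumptions, not the authors' rule set: `PI_CUIS` stands in for the 241-entry cui code list and uses placeholder strings rather than real umls identifiers, and the only real identifier is c0032290, which the text gives for "aspiration pneumonia". the field-level exclusion mirrors the statement in the text that cases with codes for both pneumonia and aspiration in the same text field were excluded.

```python
# C0032290 is the CUI the text reports for "aspiration pneumonia".
ASPIRATION_PNEUMONIA = "C0032290"

# Placeholder identifiers (NOT real UMLS CUIs) standing in for the CUI code list.
PI_CUIS = {"CUI_PNEUMONIA", "CUI_INFLUENZA"}
EXCLUDED_CUIS = {ASPIRATION_PNEUMONIA, "CUI_ASPIRATION"}


def field_is_pi(cuis):
    """A field counts if it has a P-I concept AND NOT an excluded concept."""
    has_pi = bool(cuis & PI_CUIS)
    has_excluded = bool(cuis & EXCLUDED_CUIS)
    return has_pi and not has_excluded


def flag_record(fields):
    """Flag a record if ANY cause-of-death field satisfies the field rule.

    `fields` maps a field name (e.g. "underlying") to the CUIs MetaMap
    returned for that field.
    """
    return any(field_is_pi(set(cuis)) for cuis in fields.values())


if __name__ == "__main__":
    record = {"underlying": ["CUI_ASPIRATION"], "additional": ["CUI_PNEUMONIA"]}
    print(flag_record(record))
```

because the exclusion is applied per field, a record with aspiration in one field and pneumonia in another is still flagged by this sketch.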
each death record is associated with a unique number; therefore, we considered a match to occur when the unique identifier was identified by the comparator and also found by the reference standard. three standard measures were used to evaluate the performance of one method in relation to the reference standard used in this study: precision (equivalent to positive predictive value), recall (equivalent to sensitivity or true positive rate), and f-measure. kappa statistics were used to assess agreement and mcnemar's test was used to analyze the significance of the difference between the two methods. all calculations were performed in r [42]. to calculate these values, pneumonia and influenza related deaths were examined by comparing the reference standard output vs. the two comparators: dcp and keyword search. for both comparators, the deaths were counted and categorized as true positives (cases found by the comparator: pneumonia and influenza deaths correctly classified); false positives (incorrect cases found by the comparator: the number of pneumonia and influenza deaths incorrectly identified by the comparator); and false negatives (correct cases not found by the comparator: the number of pneumonia and influenza deaths not identified by the comparator). precision, recall and f-measure were calculated as follows:
$$\text{precision} = \frac{\text{true positives}}{\text{true positives} + \text{false positives}} \tag{1}$$
$$\text{recall} = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}} \tag{2}$$
$$\text{f-measure} = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} \tag{3}$$
mcnemar's test was also calculated to evaluate the significance of the difference between the two comparators. to calculate this value a confusion matrix was created where a is the number of times both methods have correct predictions; b is the number of times method 1 has a correct prediction and method 2 has a wrong prediction; c is the number of times method 2 has a correct prediction and method 1 has a wrong prediction; d is the number of times both methods have incorrect predictions. ethics approval was not required for this study.
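the evaluation measures defined above can be sketched in python as follows. the study performed its calculations in r; this block is only an illustration. the mcnemar statistic uses the discordant counts b and c from the confusion matrix described above, and the continuity-corrected form is an assumption of this sketch (it is the default of r's `mcnemar.test`, but the text does not say which variant was used). the counts in the usage lines are illustrative values, not study data.

```python
import math


def precision(tp, fp):
    """Equation (1): share of flagged records that are true cases."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Equation (2): share of true cases that were flagged."""
    return tp / (tp + fn)


def f_measure(p, r):
    """Equation (3): harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


def mcnemar(b, c):
    """Continuity-corrected McNemar chi-square statistic and p-value (1 d.f.).

    b: method 1 correct, method 2 wrong; c: method 2 correct, method 1 wrong.
    """
    if b + c == 0:
        return 0.0, 1.0
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of a chi-square variable with one degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


if __name__ == "__main__":
    p, r = precision(tp=90, fp=10), recall(tp=90, fn=5)  # illustrative counts
    print(p, r, f_measure(p, r))
    print(mcnemar(b=25, c=5))
```

only the discordant pairs enter the mcnemar statistic, so the counts a and d are not needed by the function.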
identifying variables that could be used for re-identifying individuals were excluded from the study data. the records were processed and analyzed on a server with two opteron dual-core 2.8 ghz processors and 16 gb ram at the center for high performance computing at the university of utah. using keyword searching, the cpu processing time to identify pneumonia and influenza cases was 0.21 seconds and the wall time was 0.37 seconds. for the dcp, the total cpu processing time was 881.83 seconds. the nlp portion of the pipeline accounted for 99.4 percent of the processing time (877 seconds). while the dcp execution time is much longer, it is still well within the "in real time" realm. for instance, it would take 6,364.3 seconds of cpu time for dcp to code and flag all the weekly death records of the us (46,523). recall and precision were calculated with 95% confidence intervals; the f-measure was also calculated. the performance of each method is described below. of the 6,450 records analyzed, keyword search identified 473 records as pneumonia and influenza deaths, 21 of which were false positives. precision for keyword searching was calculated at 96%. of the 21 false positives, 6 records correctly mentioned pneumonia in the cause of death text but their corresponding icd-10 codes failed to provide any code related to pneumonia, while 2 records were flagged because they included the sub-string "pneumonia" in the additional cause of death field. the death literals for these two records were "bacteremia due to streptococcus pneumonia" and "streptococcal pneumoniae septicemia". the remaining 13 errors were due to the entry of the death literals; in all cases 'aspiration pneumonia' was not negated, either because 1) 'pneumonia' was in a separate cause of death field from 'aspiration' or 2) 'pneumonia' was not directly followed by 'aspiration' in the death text (example: "pneumonia due to secondary aspiration").
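the false positives described above follow directly from how substring matching works. the sketch below is a hypothetical keyword search, not the procedure used in the study (which is not fully described here): the search terms in `KEYWORDS` and the negation phrase in `EXCLUDED_PHRASES` are assumptions chosen only to reproduce the failure modes reported in the text.

```python
KEYWORDS = ("pneumonia", "influenza")          # assumed search terms
EXCLUDED_PHRASES = ("aspiration pneumonia",)   # assumed negation phrase


def keyword_flag(fields):
    """Flag a record if any field contains a keyword and no excluded phrase."""
    for text in fields:
        lowered = text.lower()
        if any(phrase in lowered for phrase in EXCLUDED_PHRASES):
            continue  # this field is negated; look at the next one
        if any(keyword in lowered for keyword in KEYWORDS):
            return True
    return False


if __name__ == "__main__":
    examples = [
        ["aspiration pneumonia"],                  # negated as intended
        ["streptococcal pneumoniae septicemia"],   # substring hit
        ["pneumonia due to secondary aspiration"], # negation phrase not matched
        ["aspiration", "pneumonia"],               # terms split across fields
    ]
    for fields in examples:
        print(fields, keyword_flag(fields))
```

the last three examples are flagged even though the text counts such records as false positives, which is the weakness the coded, concept-level rules are meant to address.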
a total of 20 false negatives were recorded, yielding a recall of 96%. the false negatives fell into two categories: 1) misspellings of pneumonia on the death certificate (n = 8) and 2) cases where an appropriate pneumonia or influenza icd-10 code was assigned but the death literals did not mention an appropriate scanned phrase (n = 12). the f-measure was also calculated at 96%. a high level of agreement was seen between keyword searching and the reference standard (kappa 0.95). utilizing the death certificates pipeline (dcp), we identified 481 records as pneumonia and influenza deaths, 9 of which were false positives. the precision for this method was calculated at 98%. as with the keyword searching method, 6 of the 9 false positives mentioned pneumonia in the cause of death field but their corresponding icd-10 codes failed to provide any code related to pneumonia; the remaining errors were due to the reporting of aspiration pneumonia on the death certificate. this method had only 1 false negative, for the death literal stating "recurrent aspiration with pneumonia", thus yielding a recall of 99.8%, higher than that of keyword searching. the f-measure was calculated at 99%. the level of agreement between the pipeline and the gold standard was almost perfect, with a cohen's kappa of 0.988. the precision and recall scores reported above suggest that the dcp is a better method for identifying pneumonia and influenza deaths than keyword searching. therefore, we investigated whether this observation is supported by statistical analysis. performing a fisher's exact test at α = 0.05, a significant difference was seen for both recall (p = 1.742e-05) and precision (p = 0.026). the mcnemar's test result also showed dcp to be a better method, with a p-value of 2.152e-05. for the 472 pneumonia and influenza cases found by the reference standard, dcp correctly identified 471 cases, missed one case and incorrectly flagged nine cases.
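as a consistency check, the agreement statistics can be recomputed from the counts reported above. the sketch below assumes that every record not counted as a true positive, false positive or false negative is a true negative out of the 6,450 records analyzed; that assumption, and the function name `cohen_kappa`, are not taken from the text. for the dcp it uses the counts in the preceding sentence (471 correct, 9 incorrectly flagged, 1 missed); for keyword searching it uses 473 flagged records with 21 false positives and 20 false negatives.

```python
def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a 2x2 table of comparator vs. reference standard."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement: both say "case" plus both say "not a case".
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return (observed - expected) / (1 - expected)


TOTAL_RECORDS = 6450

# DCP: 471 true positives, 9 false positives, 1 false negative.
dcp_kappa = cohen_kappa(471, 9, 1, TOTAL_RECORDS - 471 - 9 - 1)

# Keyword search: 473 flagged with 21 false positives, 20 false negatives.
kw_tp = 473 - 21
kw_kappa = cohen_kappa(kw_tp, 21, 20, TOTAL_RECORDS - kw_tp - 21 - 20)

if __name__ == "__main__":
    print(round(dcp_kappa, 3), round(kw_kappa, 3))
```

under these assumptions the two values come out at roughly 0.99 and 0.95, in line with the kappa values reported in the text.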
most failures were due to discrepancies between the death literal and its respective icd-10 code. for the only case which the pipeline did not match, the phrase 'recurrent aspiration with pneumonia' was present in the death literal. metamap coded this literal as aspiration pneumonia which was excluded from the cui code list, but its respective icd-10 included j189. for the 9 additional cases which were not present in the reference standard, we noticed two categories of errors: 1)cases where the string 'pneumonia' is present in the death literal but not coded into icd-10 and 2) the reporting of aspiration pneumonia on the death certificate. the first category of errors was not due to metamap or the rule algorithm, but perhaps due to the coding process. as described earlier, mmds produces entity axis and record axis codes. the entity axis codes would be a more appropriate reference standard for they provide the icd 10 codes for the conditions or events reported as listed by the death certifier and maintains the order as written on the death certificate [43] ; but as noted earlier only the record axis codes were made available for this study. the algorithm used to produce record axis codes from the entity axis data removes duplicate codes and contradictory diagnoses within the entity axis data to produce the more standardized record axis [44] . for example, if a medical examiner reports pneumonia with chronic obstructive pulmonary disease both conditions will be shown in entity axis code data. however, in record axis code data, they will be replaced with a single condition: chronic obstructive pulmonary disease with acute lower respiratory infection (j44.0). we were unable to verify that codes related to pneumonia were present in the entity axis codes for the six cases; therefore, we can only speculate the reason for this failure. the second category of errors was due to the reporting of aspiration pneumonia on the death certificate. 
in cases where the strings "aspiration" and "pneumonia" were not reported in the same text field, metamap processed the strings separately, thus yielding two codes, one for aspiration and the other for pneumonia, instead of one code for "aspiration pneumonia" [c0032290]. in an initial review of metamap we found that metamap had difficulties processing the phrase "pneumonia secondary to acute aspiration"; therefore, our rule detection algorithm excluded cases where the codes for pneumonia and aspiration were present in the same text field. to our knowledge, this is the first published report on using a natural language processing tool and the umls to identify pneumonia and influenza deaths from death certificates. we found that automated coding and identification of pneumonia and influenza deaths is possible and computationally efficient. the death certificates pipeline developed here was statistically different from keyword searching and has higher recall and precision when compared to the current semi-automatic methods in use by the cdc. a good recall is required to help capture the 'true' p-i deaths and a good precision is needed to avoid overestimating the number of p-i deaths. this study also indicated that keyword searching underestimated pneumonia and influenza deaths in utah. the simple keyword search method not only decreased recall and precision but also reduced the level of agreement. when reporting counts for surveillance purposes it is best to be as accurate as possible; however, there is a trade-off between recall and precision. for disease surveillance, increased precision enables public health officials to more accurately focus resources for control and prevention; therefore, although both methods had good precision, the pipeline developed would be more advantageous to utilize. metamap did an excellent job at extracting causes of death from free-form text, which is consistent with the results of riedl et al [25].
most of the concepts were present in the umls, which contributed to good recall. both recall and precision depended on the comprehensiveness of the cui code list. the performance of this system is determined largely by the coverage of terms and sources in the umls. the weakest point of both keyword searching and the system is precision. most of the concepts the system did not identify had either the aspiration text in another field or pneumonia mentioned in the cause of death text but not coded (9 cases fit these criteria). the sample size was sufficient to show a difference between the two methods. it is important to note that utilizing trained nosologists, who would manually code the death certificates, would have produced an absolute gold standard which may or may not be a better reference standard than icd-10 codes. however, we chose icd codes because their use to identify all-cause pneumonia has been examined and has been shown to be a valid tool for the identification of these cases [45, 46]. in terms of timing, while keyword searching is faster than the dcp, our method still runs in well under one second per record, which implies that it is possible to process the daily utah deaths (~40) in approximately 5.47 seconds and all daily deaths in the us (~6646) in approximately 909.17 seconds using current hardware. this timing would be much faster than the minimum of two weeks to receive the coded data from the current cdc process. moreover, these timings make it apparent that this system can be integrated in a real time surveillance system without introducing any additional bottlenecks. there are several potential limitations with this analysis. first, the generalizability of the findings is limited because the death records were only from one institution. although death certificates have a standardized format, the death registration process and the reviewing of death records differ by institution.
udoh utilizes keyword searching to identify pneumonia and influenza cases, other institutions may use more accurate (manual review) or less accurate methods for finding cases. second, a separate evaluation of the nlp component of the dcp was not performed. further research is needed to examine the use of nlp on electronic death records across institutions and countries which may have different documentation procedures. this study shows that it is feasible to achieve high levels of accuracy when using nlp tools to identify cases of pneumonia and influenza cases from electronic death records while still providing a system that can be used for real time coding of death certificates. identification of concept identifiers related to the cdc's case definition of pneumonia and influenza was very important in producing a highly accurate rule for the identification of these cases. future work will aim to improve the preprocessing phase of the pipeline by providing the inclusion of the spellchecker used by the cdc's mortality medical data system. future work will also involve evaluating the flexibility (e.g. identification of different diseases) of the system to deploy the pipeline tool, along with other public health related analytical tools, as a grid service to provide to real time public health surveillance tool that uses data and services under the control of different administrative domains. we have shown that it is feasible to automate the coding of electronic death records for real-time surveillance of deaths of public health concern. the performance of the pipeline outperformed the performance of current methods, keyword searching, in the identification of pneumonia and influenza related deaths from death certificates. therefore, the pipeline has the potential to aid in the encoding of death certificates and is flexible to identify deaths due to other conditions of interest as the need arises. 
participants of a workshop on mortality monitoring in europe: monitoring excess mortality for public health action: potential for a future european network public health surveillance: historical origins, methods and evaluation communicable disease surveillance the new automated daily mortality surveillance system in portugal description of a new all cause mortality surveillance system in sweden as a warning system using threshold detection algorithms a method for timely assessment of influenza-associated mortality in the united states mortality surveillance 1968-1976, england and wales. deaths and rates by sex and age group for 8th revision causes, a-list and chapters. london: great britain office of population census and surveys prevention and control of seasonal influenza with vaccines: recommendations of the advisory committee on immunization practices (acip) national hospital discharge survey: 2007 summary influenza-associated hospitalizations in the united states american lung association state of lung disease in diverse communities economic evaluation of influenza vaccination. 
assessment for the netherlands mortality associated with influenza and respiratory syncytial virus in the united states estimating seasonal influenza-associated deaths in the united states: cdc study confirms variability of flu flu activity & surveillance: reports and surveillance methods in the united states real-time surveillance of pneumonia and influenza mortalities via the national death certificate system prospective surveillance of excess mortality due to influenza in new south wales: feasibility and statistical approach medical language processing: computer management of narrative data death certificate comparability of cause of death between icd-9 and icd-10: preliminary estimates world health organization: international statistical classification of disease and related health problems, tenth revision version for selected data editing procedures in an automated multiple cause of death coding system using the umls and simple statistical methods to semantically categorize causes of death on death certificates description of the national center for health statistics software systems and demonstrations toward an electronic death registration system in the united states: report of the steering committee to reengineer the death registration process national association for public health statistics and information systems classifying free-text triage chief complaints into syndromic categories with natural language processing automatic detection of acute bacterial pneumonia from chest x-ray reports automated encoding of clinical documents based on natural language processing using nlp on va electronic medical records to facilitate epidemiologic case investigations evaluating natural language processing applications applied to outbreak and disease surveillance a: effective mapping of biomedical text to the umls metathesaurus: the metamap program a frequency-based technique to improve the spelling suggestion rank in medical queries lexical systems: a report to the 
board of scientific counselors umls language and vocabulary tools metamap: mapping text to the umls metathesaurus the unified language system (umls): integrating biomedical terminology umls distribution team: r: a language and environment for statistical computing. vienna: r foundation for statistical computing entity axis codes documentation of the mortality tape file for community-acquired pneumonia: can it be defined with claims data? icd-10 codes are a valid tool for identification of pneumonia in hospitalized patients aged > or = 65 years identification of pneumonia and influenza deaths using the death certificate pipeline this study has been supported in part by the grants from the national library of medicine (lm007124) and from the centers of disease control and prevention center of excellence (ip01hk000069-10). the authors declare that they have no competing interests. all the authors contributed equally to this research. all authors read and approved the final manuscript.
key: cord-027578-yapmcvps authors: menzies, rachel e.; menzies, ross g. title: death anxiety in the time of covid-19: theoretical explanations and clinical implications date: 2020-06-11 journal: cogn behav therap doi: 10.1017/s1754470x20000215 sha: doc_id: 27578 cord_uid: yapmcvps the recent covid-19 pandemic has triggered a surge in anxiety across the globe. much of the public’s behavioural and emotional response to the virus can be understood through the framework of terror management theory, which proposes that fear of death drives much of human behaviour. in the context of the current pandemic, death anxiety, a recently proposed transdiagnostic construct, appears especially relevant. fear of death has recently been shown to predict not only anxiety related to covid-19, but also to play a causal role in various mental health conditions.
given this, it is argued that treatment programmes in mental health may need to broaden their focus to directly target the dread of death. notably, cognitive behavioural therapy (cbt) has been shown to produce significant reductions in death anxiety. as such, it is possible that complementing current treatments with specific cbt techniques addressing fears of death may ensure enhanced long-term symptom reduction. further research is essential in order to examine whether treating death anxiety will indeed improve long-term outcomes, and prevent the emergence of future disorders in vulnerable populations. key learning aims: (1).. to understand terror management theory and its theoretical explanation of death anxiety in the context of covid-19. (2).. to understand the transdiagnostic role of death anxiety in mental health disorders. (3).. to understand current treatment approaches for directly targeting death anxiety, and the importance of doing so to improve long-term treatment outcomes. in december 2019, a novel coronavirus was first detected in the city of wuhan, china. within five weeks, the virus, now named covid-19, began to dominate global headlines. by mid-may 2020, covid-19 had resulted in the deaths of more than 300,000 people worldwide, with nearly 4.5 million cases confirmed (world health organization, 2020). as cases increased, governments around the world began closing borders, and introducing social distancing restrictions and lockdown orders, in an effort to slow the rapid acceleration of the virus. prior to many of these government responses, reports emerged of individuals choosing to self-isolate, as mass panic swept through communities in waves. anecdotal reports of verbal and physical aggression in grocery stores, hoarding of antibacterial products and other supplies, and racist abuse of individuals with asian appearance increased as fear took over across the world (devakumar et al., 2020; garfin et al., 2020) . 
as individuals scrambled to prevent the threat of covid-19 in any way they could, online sales of 'immune boosters' and untrialled medicines increased. analyses of google data across just 14 days in march 2020 revealed a total of 216,000 searches for where to purchase chloroquine and hydroxychloroquine, two drugs which were touted by the media as potentially effective, despite the existing clinical evidence for the efficacy of these drugs being inconclusive (liu et al., 2020) . emerging research data are already revealing high levels of anxiety concerning the virus, with findings from nearly 5000 participants suggesting that greater perceived severity of the virus is associated with poorer mental health outcomes (li et al., 2020) . arguably, this response from the public should not come as a surprise. fears of death have been proposed to be a central and universal part of the experience of being human (becker, 1973) . in fact, there is evidence of humans grappling with death anxiety for as long as our species has been recording its history (menzies, 2018b) . we are the only species that we know of that has the cognitive capacity to contemplate and anticipate our own death, yet this impressive ability comes with a downside; we are destined to live our lives 'forever shadowed by the knowledge that we will grow, blossom, and inevitably, diminish and die' (yalom, 2008, p. 1 ). on the one hand, people may develop adaptive ways of coping with their fear of death, such as building meaningful relationships and leaving a positive legacy (yalom, 2008) . on the other hand, awareness of death may also produce a powerful sense of fear or meaninglessness, and may drive a number of maladaptive coping behaviours (menzies, 2012) . some of these behaviours (e.g. avoidance) may underlie numerous mental health conditions, while other behaviours may appear, on the surface, not directly linked to death at all. 
how might our fears of death be shaping our everyday behaviour in ways that we are not even aware of? terror management theory terror management theory (tmt), a social psychological theory based on the work of cultural anthropologist ernest becker, is the leading psychological framework for explaining this effect of death fears on human behaviour (greenberg et al., 1992) . tmt posits that our awareness of our own death produces a crippling terror, and that humans have developed two distinct buffers in order to allay this fear: cultural worldviews, and self-esteem. cultural worldviews involve shared symbolic concepts of the world, including identifying with cultural values or endorsing belief systems, such as the belief in an afterlife. sharing these cultural worldviews is thought to offer a sense of 'symbolic immortality', by giving an individual a sense of permanence and meaning in the face of death. secondly, self-esteem, gained through fulfilling the expectations of our cultural worldview, is also said to buffer death anxiety, by making one feel like a valuable member of their culture, who will be remembered after death (greenberg, 2012) . tmt also proposes that humans use different defence mechanisms depending on whether thoughts of death are within or outside of conscious awareness. according to this 'dual process model', when thoughts of death are conscious, we engage in 'proximal defences', which include suppressing these thoughts (e.g. turning off a news report about covid-19 death tolls), denying one's vulnerability (e.g. 'i'm not in a high risk group, so i'll probably be fine'), or trying to prevent death (e.g. cleaning down all home surfaces with antibacterial wipes) . on the other hand, when thoughts of death leave conscious awareness, we instead engage in 'distal defences', which involve bolstering our two buffers (e.g. by endorsing our cultural worldviews, or enhancing our self-esteem). 
findings from hundreds of studies have demonstrated support for tmt (burke et al., 2010) . primarily, these studies have involved a 'mortality salience' design, in which participants in one condition are reminded of their mortality, while participants in the control condition are reminded of an aversive topic that is unrelated to death. these studies have shown that reminders of death drive a vast array of human behaviours, including intention to purchase products (dar-nimrod, 2012) , driving behaviour (taubman-ben-ari et al., 1999) , and even suntanning (routledge et al., 2004) . despite some recent studies questioning the replicability of tmt results (e.g. klein et al., 2019) , follow-up studies have demonstrated that classic tmt findings do replicate when sufficiently powered (chatard et al., 2020) . in addition, burke et al.'s (2010) review of 277 tmt experiments found that death reminders yielded moderate effects on a range of behavioural variables, with little evidence of publication bias, further highlighting the strength of mortality salience effects. given this, what role might death anxiety be playing in the current pandemic? death anxiety and the covid-19 pandemic with the exception of a handful of studies, the majority of tmt research has been conducted under laboratory conditions; i.e. for those in the mortality salience condition, death is usually primed in the form of two short questions about one's death, which participants are asked to respond to. covid-19 offers an unusual scenario, in which mortality is made salient nearly constantly, given the daily updates on death tolls from the news and social media, and ubiquitous visible death cues in the form of face masks, anti-bacterial sprays and wipes, social distancing and public health campaigns. supporting this idea, laboratory findings have demonstrated that reflecting on current epidemics or virus outbreaks (e.g. 
ebola, swine flu) produces comparable findings to standard mortality salience primes, increasing the accessibility of death-related thoughts, and increasing defensive behaviour (e.g. arrowood et al., 2017; bélanger et al., 2013; van tongeren et al., 2016). although the long-term effects of mortality salience primes are currently unknown, the consequences for human behaviour of even minor, subtle reminders of death under laboratory conditions have much to tell us about the behaviours observed during the current pandemic. first, from this perspective, the observed reports of both covert and overt racism towards asian individuals are unsurprising. these observations are supported by a recent study that found a positive relationship between coronavirus-related anxiety and avoidance of chinese food and products (lee, 2020), echoing similar observations of avoidance of chinese people following the 2003 sars outbreak (keil and ali, 2006). these experiences offer a real-world confirmation of the tmt laboratory findings that reminders of death lead people to feel more hostile towards those of different cultural backgrounds to their own, as they are seen as a threat to one's own worldviews. findings across a number of studies reveal that reminders of death increase stereotypical thinking about people of other races (schimel et al., 1999), increase aggression against those who criticise one's nation (mcgregor et al., 1998), and lead white participants to react more favourably to white pride advocates (greenberg et al., 2001). one study even found that germans interviewed in front of a cemetery reported strongly preferring german products over foreign products, whereas germans interviewed in front of a shop did not show this preference (jonas et al., 2005). similar effects have been observed in more than 12 countries worldwide (greenberg and kosloff, 2008).
so, much of the recent upsurge in xenophobia, or even hostility towards those with different political views, can be explained by the tmt notion that bolstering our cultural worldviews, and aggressing against those that threaten them, are one means of gaining a sense of symbolic immortality. this idea is further supported by the recent observation of mutual discrimination between east asian societies in the midst of the pandemic (e.g. individuals in taiwan avoiding contact with koreans and japanese individuals; lin, 2020). whilst the bolstering of one's cultural worldviews is an example of distal defences being engaged during the pandemic, proximal defences, in the form of attempts to ward off death (e.g. spikes in purchases of hydroxychloroquine, a drug falsely touted as a cure for the virus) or denial, have also been observed (jong-fast, 2020). furthermore, although much of the research is necessarily recent, some preliminary data support the idea that death anxiety may be driving a significant amount of psychological distress during this pandemic. evaluation of the psychometric properties of the fear of covid-19 scale revealed that the item 'i am afraid of losing my life because of coronavirus-19' had the highest factor loading, suggesting that one's worry about one's own fatality risk is highly predictive of broad fears of the virus (ahorsu et al., 2020). data from 1210 residents of china revealed that estimates of fatality also appear to specifically predict their psychological distress, with low estimates of one's own survival from covid-19 predicting greater levels of stress and depression on the depression, anxiety and stress scale (dass-21) (wang et al., 2020). one large study of 810 australians specifically explored fears of death in the context of the pandemic (newton-john et al., 2020). the findings revealed a significant positive correlation between death anxiety and anxious beliefs and behaviours related to covid-19 (e.g.
estimated likelihood of contracting the virus, estimated likelihood of wearing a mask in public, etc.), in addition to self-reported health anxiety, and overall psychological distress. furthermore, participant responses to items assessing beliefs surrounding the virus indicated a heightened perception of threat. for example, when participants were asked how likely they would be to die if they contracted covid-19 in the next 18 months, the mean likelihood estimate was 22%, a figure more than 11 times the actual australian case fatality rate of <2%. so, while death anxiety may indeed be a driving factor in everyday human behaviour, it appears more relevant than ever in the context of the current pandemic. covid-19 may be understood as a real-life and ever-present mortality salience prime, influencing people's behaviour in ways they may not even be consciously aware of. early findings suggest that fears of death predict anxiety about the virus, which in turn is shown to predict broader psychological distress. these findings may suggest a causal relationship between death anxiety and psychological distress, and this relationship may be exacerbated in the current pandemic. death anxiety has been proposed to be a transdiagnostic construct, underpinning a range of different mental health conditions (iverach et al., 2014) . for instance, fears of death may manifest in the frequent reassurance seeking from doctors, checking of one's body, and requests for medical testing seen in the somatic symptom-related disorders (furer et al., 2007) . in a similar vein, panic disorder often features worries about heart attacks during panic attacks, in addition to repeated appointments with cardiac specialists to allay these concerns (starcevic, 2007) . specific phobias have been argued to have death anxiety at their core for over a century (kingman, 1928) , with all of the common phobic objects having the potential to directly result in death (e.g. 
fears of spiders, snakes, flying and heights). fears of death have also been argued to play a central role in various presentations of obsessive compulsive disorder, as clients attempt to prevent death by illness (in the contamination subtype), household fire or electrocution (in compulsive checking), and death to oneself or another due to acting on intrusive thoughts (as seen in aggressive obsessions) (menzies and dar-nimrod, 2017; menzies et al., 2015) . existential concerns have also been argued to play a role in the depressive disorders, with concerns surrounding death and meaninglessness being a common theme (ghaemi, 2007; simon et al., 1998) . a number of studies have demonstrated significant relationships between self-reported death anxiety and symptomology of various disorders, including separation anxiety (caras, 1995) , hypochondriasis (noyes et al., 2002) , post-traumatic stress disorder (martz, 2004) , depression (ongider and eyuboglu, 2013) and eating disorders (le marne and harris, 2016). results from one large clinical sample found significant and positive correlations between death anxiety and number of lifetime mental health diagnoses, number of medications for mental health, dass-21 depression, anxiety and stress scores, as well as the symptom severity of 12 different disorders . notably, these relationships remained significant after controlling for neuroticism, suggesting the unique role of death anxiety in psychopathology. while limited conclusions regarding causality can be drawn from such correlational designs, a handful of studies have explored the causal role of death anxiety in mental illnesses using a mortality salience design. 
these have revealed that reminders of death increase avoidance of spider-related stimuli among spider phobics (strachan et al., 2007) , social avoidance (strachan et al., 2007) and attentional biases towards threat among the socially anxious (finch et al., 2016) , and even restricted consumption of high caloric foods amongst women, suggesting the relevance of death anxiety in eating disorders (goldenberg et al., 2005) . while few studies have used clinical samples, one study investigated the effect of mortality salience on compulsive handwashing, utilising a large sample of treatment-seeking individuals diagnosed with ocd (menzies and dar-nimrod, 2017) . participants were first primed with either death or a control topic. following a short delay to allow the effects of the prime to become unconscious, they were asked to wash their hands. the findings revealed that reminders of death doubled the time spent handwashing. notably, this increase in handwashing occurred despite no difference in reported anxiety or perceptions of cleanliness. results from another recent mortality salience design appear particularly relevant to the current pandemic. across a sample of participants with panic disorder or a somatic symptom-related disorder, reminders of death were shown to increase time spent checking one's body for physical symptoms, increase perceived threat of one's symptoms, and also increase intention to visit a medical specialist in the near future . these findings suggest that death anxiety drives relevant anxious behaviour for those vulnerable to health-related worries. results from numerous studies appear to suggest that fear of death is indeed a transdiagnostic construct driving a number of mental health conditions, although further research using treatment-seeking and clinical samples is clearly warranted. 
if death anxiety does underlie numerous disorders, this may explain the 'revolving door' phenomenon often observed in clinical practice, in which an individual receives apparently successful treatment for one disorder, only to present with a distinctly different disorder at a later time point (iverach et al., 2014, p. 590) . if death anxiety is indeed 'the worm at the core' (james, 1985, p. 119) of the human psyche, then failing to treat it may result in individuals continuing to present with different mental health conditions at various points across their lifespan. fear of death may need to be assessed and explicitly targeted in treatment in order to achieve long-term amelioration in symptoms and foster ongoing client wellbeing. as with any target of clinical treatments, a thorough assessment paves the way to the most effective treatments of death anxiety, tailored to the individual's unique needs. the clinical interview in early sessions should focus on exploring the topic of death, including assessing for any early losses, memories, or experiences associated with death (menzies and veale, 2020) . it is also essential to assess the individual's specific worries or thoughts about death, as these can vary largely between individuals. for example, worries may revolve around the dying process itself (e.g. pain or loss of cognitive capacities), the feared death of a loved one, fears concerning eternal punishment in the afterlife, uncertainty surrounding life after death, or non-existence itself, and each theme may need to be addressed using distinctly different lines of cognitive challenging. maladaptive behaviours the individual engages in should also be identified during the assessment stage, including any avoidance behaviours (e.g. avoiding the news, hospitals, flying or driving, or suppressing thoughts around death), reassurance seeking (e.g. from family or one's doctor), self-medicating, or compensatory behaviours (e.g. 
excessive exercise) (menzies and veale, 2020) . in the context of the current pandemic, it would be important to distinguish behaviours which are adaptive (i.e. behaviours generally recommended by health professionals and public officials, such as wearing a face mask when leaving the house, self-isolating when symptomatic, and regularly washing one's hands for recommended durations) compared with those that are maladaptive (i.e. behaviours that are not in line with standard recommendations and disrupt the individual's life, such as washing one's hands for hours each day, or requesting repeated medical tests for the virus despite lack of symptoms). alongside a standard clinical interview, questionnaires can prove useful in measuring severity of death fears, as well as tracking change following treatment. one recent systematic review of death anxiety measures revealed that there is a strong need for rigorous measures which have been validated in clinical samples, and that many measures in this field lack adequate psychometric properties (zuccala et al., 2019) . despite this, a number of measures may prove particularly useful in assessing death anxiety. these include the collett-lester fear of death scale-revised (lester, 1990) , which has been demonstrated to be responsive to treatment effects, and thus appears to be the best choice for exploring clinical change, and the multidimensional fear of death scale (hoelter, 1979) , for which means for various clinical groups have been reported . the death attitude profile-revised (wong et al., 1994) may also offer clinical utility, due to its unique assessment of adaptive attitudes, such as three distinct types of death acceptance, which have been shown to predict more positive outcomes (tomer and eliason, 2000) . despite being understood to be an 'existential given' (yalom, 1980) , empirical findings fortunately indicate that death anxiety can indeed be ameliorated. 
one recent meta-analysis examined the effects of randomised controlled trials on death anxiety (menzies et al., 2018) . this revealed that psychosocial interventions produced significant reductions in death anxiety relative to control conditions. notably, this effect was found to be driven by cognitive behaviour therapy (cbt) interventions, which produced significantly greater improvements in death fears compared with other treatment modalities. in particular, cbt treatments centring on graded exposure therapy were found to be most effective. in fact, alternative treatment options examined by the meta-analysis failed to produce any significant change in death anxiety scores (menzies et al., 2018) . given these meta-analytic findings, cbt appears to be the most appropriate treatment for addressing death anxiety, and various techniques for doing so have been proposed (see further, menzies, 2018a; menzies and veale, 2020) . a number of exposure therapy tasks have been recommended in order to ameliorate death anxiety. of course, as with any exposure tasks, these should be specifically tailored to the individual's own unique pattern of avoidance, and situations or themes that the individual has systematically avoided should be prioritised. one exposure task that can be tailored to the individual's specific concerns is that of an 'illness story', recommended by furer et al. (2007) . this involves writing a vivid description of the death of oneself or a loved one, starting with the events leading up to the death (e.g. the initial diagnosis of a terminal illness), progressing to the death itself, followed by the imagined funeral and aftermath. a similar task is popularised by acceptance and commitment therapy, which involves vividly imagining one's own funeral, and writing one's own eulogy and a tombstone inscription (hayes and smith, 2005) . 
other exposure tasks may involve visiting places associated with death that the client has avoided, such as hospitals, nursing homes, cemeteries or funeral homes. reading obituaries online or in the newspaper may also offer valuable exposure opportunities, and clients should be encouraged to deliberately seek out those who have died around their own age (furer et al., 2007) . preparing one's will, or having discussions regarding end-of-life preferences, may also be considered as exposure tasks, and may serve the additional benefit of increasing the individual's sense of control over their death (furer et al., 2007; henderson, 1990) . books (e.g. when breath becomes air by paul kalanithi), films (e.g. blade runner, up), television shows (e.g. after life) and music (e.g. all things must pass by george harrison) related to death may all offer valuable and powerful opportunities for exposure, in addition to helping to normalise death. two thousand years ago, the stoic philosophers of ancient greece observed that 'it is not things themselves that trouble people, but their opinions about things' (epictetus, 2018, p. 11 ). this principle lies at the heart of both stoic philosophy (which emphasised the need to accept death as a universal event outside of our control) and cbt. all of us hold an array of beliefs surrounding death, which may fluctuate between being adaptive (e.g. the belief that we would ultimately cope with the death of a loved one) or maladaptive (e.g. the belief that dying will inherently involve pain and suffering). beliefs of this latter type will understandably cause distress for many individuals, and should be explicitly identified and challenged in therapy. for example, in the context of covid-19, the distress of some individuals will be grounded on over-estimating the probability of death from the virus; over-estimates of the fatality risk are commonplace (newton-john et al., 2020) . 
however, while standard treatments for anxiety may often involve disproving the client's probability estimates (kirk and rouf, 2004), this is not recommended in treatments targeting death anxiety. disproving the individual's estimate of dying from any one particular cause (e.g. falling to one's death, dying in a plane crash, or succumbing to covid-19) only serves to address the proximal threat, and will probably do little to address their fear of their own inevitable death, from one cause or another. as such, it is central instead to focus on addressing the cost of death, rather than merely the probability. clients should be guided to cultivate an attitude of 'neutral acceptance' towards death; that is, an acceptance of death as a universal fact outside of one's control, and therefore neither good nor bad (wong et al., 1994). standard cognitive challenging techniques can also be used to challenge unrealistic beliefs surrounding death. for example, for individuals fearing pain associated with dying, corrective information may be provided in the form of information from palliative care, and research indicating that dying is less unpleasant than people typically imagine. notably, theoretical orientations outside of standard cbt may also prove valuable in shifting clients' attitudes towards death. approaches from existential psychotherapy may be particularly relevant, and yalom (1980) outlines many relevant treatment recommendations from an existential lens. for example, for clients who express anxiety surrounding the concept of non-existence, yalom (2008) recommends the use of the stoic 'symmetry' argument, which proposes that humans have already experienced non-existence, namely prior to their birth. that is, death 'returns us to that peace in which we reposed before we were born. if someone pities the dead, let him also pity those not yet born' (seneca, 2018).
these clients may also be encouraged to foster gratitude for ever coming into existence at all, an idea persuasively expressed by richard dawkins, who notes that 'we are going to die, and that makes us the lucky ones', as we have 'won the lottery of birth against all odds' (dawkins, 1998, p. 1) . in order to help build identification with this idea, one exercise may involve estimating the likelihood of one's existence, by calculating the probability of one's parents ever meeting, followed by grandparents, and so forth (menzies, 2012) , in order to help the client focus on the incredible unlikelihood of their own dna sequence ever existing at all, rather than focusing on the tragedy of their own impermanence. the recent covid-19 pandemic has caused an understandable surge in anxiety across the globe. much of the behavioural response to covid-19 can be understood through the lens of terror management theory, which argues that death anxiety drives much of human behaviour (greenberg, 2012) . from this perspective, reminders of death (of which there are many in the current pandemic), produce increases in attempts to avoid a physical death (such as by wearing protective gear or self-isolating) or ensure a symbolic immortality (such as by bolstering one's cultural worldviews, and aggressing against those that threaten them). death anxiety, which has recently been proposed to be a transdiagnostic construct (iverach et al., 2014) , appears to be more relevant now than ever before. in addition to predicting anxiety related to covid-19 (newton-john et al., 2020) , fear of death has also been shown to play a causal role across a number of mental health conditions (menzies and dar-nimrod, 2017; strachan et al., 2007) . given this, current standard treatments for mental health conditions may benefit from addressing death anxiety directly, in order to prevent the 'revolving door' often seen in mental health services (iverach et al., 2014, p. 590) . 
fortunately, cbt has been demonstrated to produce significant reductions in death anxiety, with exposure appearing to be particularly effective (menzies et al., 2018) . complementing current treatments with specific cbt techniques addressing fears of death may help to ensure the best long-term outcomes for clients, and protect the individual from future disorders. however, further research is needed to examine whether treating death anxiety will in fact reduce the likelihood of future mental health problems. acknowledgements. none. financial support. none. conflicts of interest. the authors declare no conflicts of interest. (1) increasing evidence suggests that death anxiety is a key transdiagnostic construct, and may contribute to various mental health conditions. (2) standard treatments for a variety of disorders may need to be supplemented with specific treatment targeting death anxiety. (3) recent evidence demonstrates that death anxiety can be effectively reduced using cbt with a focus on exposure therapy. (4) we suggest a number of cbt treatment strategies, including cognitive reframing of unhelpful thoughts, and exposure tasks tailored to the feared situations, themes or images the individual avoids. (5) future research is needed to examine whether directly addressing death anxiety does indeed produce long-term improvement in symptoms, and prevent future disorders. brisbane, australia: australian academic press. 
the fear of covid-19 scale: development and initial validation ebola salience, death-thought accessibility, and worldview defense: a terror management theory perspective the denial of death supersize my identity: when thoughts of contracting swine flu boost one's patriotic identity two decades of terror management research: a meta-analysis of mortality salience research the relationships among psychological separation, the quality of attachment, separation anxiety and death anxiety a word of caution about many labs 4: if you fail to follow your preregistered plan, you may fail to find a real effect viewing death on television increases the appeal of advertised products unweaving the rainbow racism and discrimination in covid-19 responses how to be free: an ancient guide to the stoic life (a. a. long, trans.) terror mismanagement: evidence that mortality salience exacerbates attentional bias in social anxiety treating health anxiety and fear of death: a practitioner's guide the novel coronavirus (covid-2019) outbreak: amplification of public health consequences by media exposure feeling and time: the phenomenology of mood disorders, depressive realism, and existential psychotherapy dying to be thin: the effects of mortality salience and body mass index on restricted eating among women terror management theory: from genesis to revelations terror management theory: implications for understanding prejudice, stereotyping, intergroup conflict, and political attitudes. 
social and personality psychology compass sympathy for the devil: evidence that reminding whites of their mortality promotes more favorable reactions to white racists assessing the terror management analysis of self-esteem: converging evidence of an anxiety-buffering function get out of your mind and into your life: the new acceptance and commitment therapy beyond the living will multidimensional treatment of fear of death death anxiety and its role in psychopathology: reviewing the status of a transdiagnostic construct the varieties of religious experience currencies as cultural symbols -an existential psychological perspective on reactions of germans toward the euro why are so many baby boomers in denial over the coronavirus? vogue multiculturalism, racism and infectious disease in the global city: the experience of the 2003 sars outbreak in toronto fears and phobias: part ii. welfare magazine specific phobias many labs 4: failure to replicate mortality salience effect with and without original author involvement death anxiety, perfectionism and disordered eating coronavirus anxiety scale: a brief mental health screener for covid-19 related anxiety the collett-lester fear of death scale: the original version and a revision self-control moderates the association between perceived severity of the coronavirus disease 2019 (covid-19) and mental health problems among the chinese public social reaction toward the 2019 novel coronavirus (covid-19). 
social health and behavior internet searches for unproven covid-19 therapies in the united states death anxiety as a predictor of posttraumatic stress levels among individuals with spinal cord injuries terror management and aggression: evidence that mortality salience motivates aggression against worldview-threatening others cognitive and behavioural procedures for the treatment of death anxiety impermanence and the human dilemma: observations across the ages death anxiety and its relationship with obsessive-compulsive disorder the relationship between death anxiety and severity of mental illnesses the effect of mortality salience on body scanning behaviours in mental illnesses creative approaches to treating the dread of death the effects of psychosocial interventions on death anxiety: a meta-analysis and systematic review of randomised controlled trials the dread of death and its role in psychopathology the role of death fears in obsessive-compulsive disorder psychological distress and covid-19: estimations of threat and the relationship with death anxiety hypochondriasis and fear of death investigation of death anxiety among depressive patients a dual-process model of defense against conscious and unconscious death-related thoughts: an extension of terror management theory a time to tan: proximal and distal effects of mortality salience on sun exposure intents stereotypes and terror management: evidence that mortality salience enhances stereotypic thinking and preferences how to die: an ancient guide to the end of life terror management and meaning: evidence that the opportunity to defend the worldview in response to mortality salience increases the meaningfulness of life in the mildly depressed body as the source of threat and fear of death in hypochondriasis and panic disorder terror mismanagement: evidence that mortality salience exacerbates phobic and compulsive behaviours the impact of mortality salience on reckless driving: a test of terror management 
mechanisms beliefs about self, life, and death: testing aspects of a comprehensive model of death anxiety and death attitudes ebola as an existential threat? experimentally-primed ebola reminders intensify national-security concerns among extrinsically religious individuals immediate psychological responses and associated factors during the initial stage of the 2019 coronavirus disease (covid-19) epidemic among the general population in china death attitude profile-revised: a multidimensional measure of attitudes toward death coronavirus disease (covid-19): situation report -117 existential psychotherapy staring at the sun: overcoming the terror of death a systematic review of the psychometric properties of death anxiety self-report measures death anxiety in the time of covid-19: theoretical explanations and clinical implications. the cognitive behaviour therapist key: cord-226245-p0cyzjwf authors: schneble, marc; nicola, giacomo de; kauermann, goran; berger, ursula title: nowcasting fatal covid-19 infections on a regional level in germany date: 2020-05-15 journal: nan doi: nan sha: doc_id: 226245 cord_uid: p0cyzjwf we analyse the temporal and regional structure in mortality rates related to covid-19 infections. we relate the fatality date of each deceased patient to the corresponding day of registration of the infection, leading to a nowcasting model which allows us to estimate the number of present-day infections that will, at a later date, prove to be fatal. the numbers are broken down to the district level in germany. given that death counts generally provide more reliable information on the spread of the disease compared to infection counts, which inevitably depend on testing strategy and capacity, the proposed model and the presented results allow to obtain reliable insight into the current state of the pandemic in germany. in march 2020, covid-19 became a global pandemic. 
from wuhan, china, the virus spread across the whole world, and with its diffusion, more and more data became available to scientists for analytical purposes. in daily reports, the who provides the number of registered infections as well as the daily death toll globally (https://www.who.int/). the number of registered infections inevitably depends on the testing strategy in each country (see e.g. cohen and kupferschmidt, 2020). this has a direct influence on the number of undetected infections (see e.g. li et al., 2020), and first empirical analyses aim to quantify how detected and undetected infections are related (see e.g. niehus et al., 2020). though similar issues with respect to data quality hold for the reported number of fatalities (see e.g. baud et al., 2020), the number of deaths can overall be considered a more reliable source of information than the number of registered infections. the results of the "heinsberg study" in germany point in the same direction (streeck et al., 2020). a thorough analysis of death counts can in turn generate insights on changes in infections as proposed in flaxman et al. (2020) (see also ferguson et al., 2020). in this paper we pursue the idea of directly modelling registered death counts instead of registered infections. we analyse data from germany and break down the analyses to a regional level. such a regional view is important, considering the local nature of some of the outbreaks, for example in italy, france (see e.g. massonnaud et al., 2020) or spain. the analysis of fatalities has, however, an inevitable time delay, and requires taking the course of the disease into account. a first approach to modelling and analysing the time from illness and onset of symptoms to reporting and further to death is given in jung et al. (2020) (see also linton et al., 2020).
understanding the delay between onset and registration of an infection and, for severe cases, the time between registered infection and death can be of vital importance. knowledge on those time spans allows us to obtain estimates for the number of infections that are expected to be fatal based on the number of infections registered on the present day. the statistical technique to obtain such estimates is called nowcasting (see e.g. höhle and an der heiden, 2014) and traces back to lawless (1994) . nowcasting in covid-19 data analyses is not novel and is for instance used in günther et al. (2020) for nowcasting daily infection counts, that is to adjust daily reported new infections to include infections which occurred the same day but were not yet reported. we extend this approach to model the delay between the registration date of an infection and its fatal outcome. we therefore analyse the number of fatal cases of covid-19 infections in germany using district-level data. the data are provided by the robert-koch-institute (www.rki.de) and give the cumulative number of deaths in different gender and age groups for each of the 412 administrative districts in germany together with the date of registration of the infection. the data are available in dynamic form through daily downloads of the updated cumulated numbers of deaths. we employ flexible statistical models with smooth components (see e.g. wood, 2017) assuming a district specific poisson process. the spatial structure in the death rate is incorporated in two ways. first, we assume a spatial correlation of the number of deaths by including a long-range smooth spatial death intensity. this allows to show that regions of germany are affected to different extents. on top of this long-range effect we include two types of unstructured region specific effects. 
an overall region specific effect reflects the situation of a district as a whole, while a short-term effect mirrors region specific variation of fatalities over time and captures local outbreaks as happened in e.g. heinsberg (north-rhine-westphalia) or tirschenreuth (bavaria). in addition we include dynamic effects to capture the global changes in the number of fatal infections for germany over calendar time. this enables us to investigate the impact of certain interventions, such as social distancing, school closure, complete lockdowns and lockdown releases, on the dynamic of the infection and hence on the number of deaths. modelling infectious diseases is a well developed field in statistics and we refer to held et al. (2017) for a general overview of the different models. we also refer to the powerful r package surveillance . since our focus is on analysing the district specific dynamics of fatal infections we here make use of poisson-based models implemented in the mgcv package in r, which allows to decompose the spatial component in more depth. the paper is organized as follows. in section 2 we describe the data. section 3 highlights the results of our analysis. the remaining sections provide the technical material, starting with section 4 where we motivate the statistical model, which is extended by our nowcasting model in section 5. extended results as well as model validation are given in section 6, while section 7 concludes the paper. we make use of the covid-19 dataset provided by the robert-koch-institute for the 412 districts in germany (which also include the twelve districts of berlin separately). the data are updated on a daily basis and can be downloaded from the robert-koch-institute's website. we have daily downloads of the data for the time interval from march 27, 2020 until today. 
the subsequent analysis was conducted on may 14, 2020, and was performed considering only deadly infections with registration dates from march 26, 2020 until may 13, 2020 (the day before the day of analysis). the data contain the newly notified laboratory-confirmed covid-19 infections and the cumulated number of deaths related to covid-19 for each district of germany, classified by gender and age group. each data entry has a time stamp which corresponds to the registration date of a confirmed covid-19 infection. this means that the time stamp for a fatal outcome always refers to the registration date and not to the death date. due to daily downloads of the data we can derive the time point of death (or to be more specific, the time point when the death of a case is included in the database). we obtain the latter by observing a status change from infected to deceased when comparing the data from two consecutive days. the robert-koch-institute collects the data from the district-based health authorities (gesundheitsämter). due to different population sizes in the districts and certainly also because of different local situations, some health authorities report the daily numbers to the robert-koch-institute with a delay. this happens in particular over the weekend, a fact that we need to take into account in our model. we refrain from providing general descriptive statistics of the data here, since these numbers can easily be found on the rki webpage, which also gives a link to a dashboard to visualize the data (see also https://corona.stat.uni-muenchen.de/maps/) before we discuss our modelling approach in detail, we want to describe our major findings. first, table 1 shows that age and gender both play a major role when estimating the daily death toll. as is generally known, elderly people exhibit a much higher death rate which is for the age group 80+ around 100 times higher than for people in the age group 35-59. 
a remarkable difference is also observed between genders, where the expected death rate of females is around 40% (≈ 1 − exp(−0.503)) lower than the death rate for males. furthermore, we see that significantly less deaths are attributed to infections registered on sundays compared to weekdays, due to the existing reporting delay during weekends. our model includes a global smooth time trend representing changes in the death rate since march 26th. this is visualized in figure 1 . the plotted death rate is scaled to give the expected number of deaths per 100.000 people in an average district for the reference group, i.e. males in the age group 35 -59. overall, we see a peak in the death rate on april 3rd and a downwards slope till end of april. however, our nowcast reveals that the rate remains constant since beginning of may. note that this recent development cannot be seen by simply displaying the raw death counts of these days. the nowcasting step inevitably carries statistical uncertainty, which is taken into account in figure 1 by including best and worst case scenarios. the latter are based on bootstrapped confidence intervals, where details are provided in section 6.3 later in the paper. our aim is to investigate spatial variation and regional dynamics. to do so, we combine a global geographic trend for germany with unstructured region-specific effects, where the latter uncover local behaviour. in figure 2 we combine these different components and map the fitted nowcasted death counts related to covid-19 for the different districts of germany, cumulating over the last seven days before the day of analysis (here may 14, 2020). while in most districts of germany the death rate is relatively low, some hotspots can be identified. among those, traunstein and rosenheim (in the south-east part of bavaria) are the most evident, but greiz and sonneberg (east and south part of thuringia) stand out as well, to mention a few. 
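the reading of the gender coefficient as a relative reduction in the death rate can be checked with a few lines of python. this is only a sketch of the arithmetic 1 − exp(−0.503) quoted in the text; the function name is hypothetical and nothing here refits the model.

```python
import math

def relative_reduction(log_rate_ratio: float) -> float:
    """Relative reduction in a rate implied by a coefficient on the log scale."""
    return 1.0 - math.exp(log_rate_ratio)

# Coefficient for females (reference: males) as quoted in the text.
female_coef = -0.503
reduction = relative_reduction(female_coef)
print(f"{reduction:.3f}")  # 0.395, i.e. roughly 40% lower
```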
a deeper investigation of the spatial structure is provided in section 6, where we show the global geographic trend and provide maps that allow us to detect new hotspot areas, after correcting for the overall spatial distribution of the infection. on the day of analysis, we do not observe the total counts of deaths for recently registered infections, since not all patients with an ongoing fatal infection have died yet. we therefore nowcast those numbers, i.e. we predict the prospective deaths which can be attributed to all registration dates up to today. this is done on a national level, and the resulting nowcast of fatal infections for germany is shown in figure 3. for example, on may 14, 2020 there are 25 deaths reported where the infection was registered on may 5th (red line on may 5th). we expect this number to increase to about 50 when all deaths due to covid-19 for this registration date will have been reported (blue line on may 5th). naturally, the closer a date is to the present, the larger the uncertainty in the nowcast. this is shown by the shaded bands. details on how the statistical uncertainty has been quantified are provided in section 5 below. the fit of this model has been incorporated into the district model discussed before, but the nowcast results are interesting in their own right. the curve confirms that the number of fatal infections has been decreasing since the beginning of april. note that the curve also mirrors the "weekend effect" in registration, as fewer infections are reported on sundays. further analyses and a detailed description of the model are given in the following sections. let $Y_{t,r,g}$ denote the number of daily deaths due to covid-19 in district/region $r$ and age and gender group $g$ with time point (date of registration) $t = 0, \dots, T$. here $t = T$ corresponds to the day of analysis, which is may 14, 2020, and $t = 0$ corresponds to march 26, 2020. note that time point $t$ refers to the time point of registration, i.e.
the date at which the infection was confirmed. even though the time point of infection obviously precedes that of death, registration can also occur after death, e.g. when a post mortem test is conducted, or when test results arrive after the patient has passed away. we set the day of death to be equal to the day of registered infection in this case. the majority of fatalities with registered infection at time point $T$ have not yet been observed at time $T$, as these deaths will occur later. we therefore need a model for nowcasting, which is discussed in the next section. for now we assume all $Y_{t,r,g}$ to be known. we model $Y_{t,r,g}$ as (quasi-)poisson distributed according to

$$Y_{t,r,g} \sim \text{Poisson}(\lambda_{t,r,g}), \tag{1}$$

where we specify $\lambda_{t,r,g}$ through

$$\lambda_{t,r,g} = \exp\{\beta_0 + \text{age}_g \beta_{\text{age}} + \text{gender}_g \beta_{\text{gender}} + \text{weekday}_t \beta_{\text{weekday}} + m_1(t) + m_2(s_r) + u_{r0} + u_{r1} \mathbb{1}\{t \geq T - 14\} + \log(\text{pop}_{r,g})\}. \tag{2}$$

the linear predictor is composed as follows:
• $\beta_0$ is the intercept.
• $\beta_{\text{age}}$ and $\beta_{\text{gender}}$ are the age and gender related regression coefficients.
• $\beta_{\text{weekday}}$ are the weekday-related regression coefficients.
• $m_1(t)$ is an overall smooth time trend, with no prior structure imposed on it.
• $m_2(s_r)$ is a smooth spatial effect, where $s_r$ is the geographical centroid of district/region $r$.
• $u_{r0}$ and $u_{r1}$ are district/region-specific random effects which are i.i.d. and follow a normal prior probability model. while $u_{r0}$ specifies an overall level of the death rate for district $r$ over the entire observation time, $u_{r1}$ reveals region specific dynamics by allowing the regional effects to differ for the last 14 days.
• $\text{pop}_{r,g}$ is the gender and age group-specific population size in district/region $r$ and serves as an offset in our model.
we here emphasize that we fit two spatial effects of different types: we model a smooth spatial effect, i.e. $m_2(s_r)$, which takes the correlation between the death rates of neighbouring districts/regions into account and gives a global overview of the spatial distribution of fatal infections.
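the role of the population offset in the poisson rate can be illustrated with a short python sketch. it assumes only what the component list states, namely that the log population size enters the linear predictor with a fixed coefficient of one; the function name and all numerical values are hypothetical and are not estimates from the paper.

```python
import math

def expected_deaths(linear_terms, population):
    """Poisson mean with a log-population offset: population * exp(sum of terms)."""
    # The offset log(population) enters with coefficient 1 (it is not estimated).
    eta = sum(linear_terms) + math.log(population)
    return math.exp(eta)

# Hypothetical contributions: intercept, age, gender, weekday (illustration only).
terms = [-12.0, 0.0, -0.503, 0.1]
lam_small = expected_deaths(terms, population=10_000)
lam_large = expected_deaths(terms, population=20_000)

# Doubling the population doubles the expected count; the rate per person is unchanged.
ratio = lam_large / lam_small
print(round(ratio, 6))
```

because of the offset, the remaining terms of the predictor describe a death rate per person rather than a raw count, which is what makes districts of different size comparable.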
in addition to that we also have unstructured district/region-specific effects $u_r = (u_{r0}, u_{r1})^\top$, which capture local behaviour related to single districts only. the district specific effects $u_r$ are considered as random with the prior structure

$$u_r \overset{\text{i.i.d.}}{\sim} N(0, \Sigma_u) \tag{3}$$

for $r = 1, \dots, 412$. the prior variance matrix $\Sigma_u$ is estimated from the data. the predicted values $\hat{u}_r$ (i.e. the posterior mode) exhibit districts that show unexpectedly high or low death tolls when adjusted for the global spatial structure and for age- and gender-specific population size. model (1) belongs to the model class of generalized additive mixed models, see e.g. wood (2017). the smooth functions are estimated by penalized splines, where the quadratic penalty can be comprehended as a normal prior (see e.g. wand, 2003). the same type of prior structure holds for the region-specific random effects $u_r$. in other words, smooth estimation and random effect estimation can be accommodated in one fitting routine, which is implemented in the r package mgcv. this package has been used to fit the model, so that no extra software implementation was necessary. this demonstrates the practicability of the method. the above model cannot be fitted directly to the available data, since we need to take the course of the disease into account. for a given registration date $t$, the number of deaths of patients registered as positive on that day, $Y_{t,r,g}$, may not yet be known, since not all patients with a fatal outcome of the disease have died yet. this requires the implementation of nowcasting. we do this on a national level, and cumulate the numbers over district/region $r$ and gender and age groups $g$. this allows us to drop the corresponding subscripts in the following, and we simply denote the cumulated number of deaths with registered infections at day $t$ by $Y_t$. let $N_{t,d}$ denote the number of deaths reported on day $t + d$ for infections registered on day $t$.
assuming that the true date of death is at $t + d$, or at least close to it, we ignore any time delays between time of death and its notification to the health authorities. we call $d$ the duration between the registration date as a covid-19 patient and the reported day of death, where $d = 1, \dots, d_{\max}$. here, $d_{\max}$ is a fixed reasonable maximum duration, which we set to 30 days (see e.g. wilson et al., 2020). the minimum delay is one day. in nowcasting we are interested in the cumulated number of deaths for infections registered on day $t$, which we define as

$$Y_t = \sum_{d=1}^{d_{\max}} N_{t,d}.$$

the total number of deaths with a registered infection at $t$ is unknown at time point $t$ and becomes available only after $d_{\max}$ days. in other words, only after $d_{\max}$ days do we know exactly how many deaths occurred due to an infection which was registered on day $t$. we define the partial cumulated sum of deaths as

$$C_{t,d} = \sum_{j=1}^{d} N_{t,j}.$$

on day $t = T$, when the nowcasting is performed, we are faced with the following data constellation, where na stands for not (yet) available: $N_{t,d}$ is observed whenever $t + d \leq T$ and is na for $t + d > T$. we may consider the time span between registered infection and (reported) death as a discrete duration time taking values $d = 1, \dots, d_{\max}$. let $D$ be the random duration time, which by construction is a multinomial random variable. in principle, for each death we can consider the pairs $(D_i, t_i)$ as i.i.d. and we aim to find a suitable regression model for $D_i$ given $t_i$, including potential additional covariates $x_{t,d}$. we make use of the sequential multinomial model (see agresti, 2010) and define

$$\pi(d; t, x_{t,d}) = P(D = d \mid D \leq d; t, x_{t,d}).$$

let $F_t(d)$ denote the corresponding cumulated distribution function of $D$, which relates to the probabilities $\pi(\cdot)$ through

$$F_t(d) = P_t(D \leq d) = P(D \leq d \mid D \leq d + 1) \cdot P(D \leq d + 1) = (1 - \pi(d + 1; \cdot)) \cdot (1 - \pi(d + 2; \cdot)) \cdots (1 - \pi(d_{\max}; \cdot))$$

for $d = 1, \dots, d_{\max} - 1$ and $F_t(d_{\max}) = 1$. the available data on cumulated death counts allow us to estimate the conditional probabilities $\pi(d; \cdot)$ for $d = 2, \dots, d_{\max}$. in fact, the sequential multinomial model allows us to look at binary data such that

$$N_{t,d} \mid C_{t,d} \sim \text{Binomial}\big(C_{t,d}, \pi(d; t, x_{t,d})\big) \tag{5}$$

with

$$\text{logit}\, \pi(d; t, x_{t,d}) = s_1(t) + s_2(d) + x_{t,d} \gamma, \tag{6}$$

where
• $s_1(t)$ is an overall smooth time trend over calendar days,
• $s_2(d)$ is a smooth duration effect, capturing the course of the disease,
• $x_{t,d}$ are covariates which may be time and duration specific.
assuming that $D$, the duration between a registered fatal infection and its reported death, is independent of the number of fatal covid-19 infections, we obtain the relationship

$$E(C_{t,T-t}) = F_t(T - t) \cdot E(Y_t). \tag{7}$$

note further that if we model $Y_t$ with a quasi-poisson model as presented in the previous section, we have no available observation $Y_t$ for time points $t > T - d_{\max}$. instead, we have observed $C_{t,T-t}$, which relates to the mean of $Y_t$ through (7). including $\log F_t(T - t)$ as an additional offset in model (2) therefore allows us to fit the model as before, but with nowcasted deaths included. that means, instead of $\lambda_{t,r,g}$ as in (2), the expected death rates are now parametrized by $\tilde{\lambda}_{t,r,g} = \lambda_{t,r,g} \exp(\log F_t(T - t))$, where the latter multiplicative term is included as additional offset in the model. we fit the nowcasting model (5) with parametrization (6). we include a weekday effect for the registration date of the infection with reference category "monday". the estimates of the fixed linear effects are shown in table 2. the fitted smooth effects are shown in figure 4, where the top panel shows the effect over calendar time, which is very weak and confirms that the course of the disease hardly varies over time. this suggests that the german health care system remained stable over the considered period, and hence survival did not depend on the date on which the infection was notified. the bottom panel of figure 4 shows the course of the disease as a smooth effect over the time between registration of the infection and death. we see that the probabilities $\pi(d; \cdot)$ decrease in $d$, where this effect is the strongest in the first days after registration.
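the link between the conditional probabilities π(d; ·) and the distribution function of the duration, and its use for nowcasting, can be sketched in python as follows. the sketch assumes the product form f(d) = (1 − π(d + 1)) · … · (1 − π(d max)) with f(d max) = 1 given in this section, and it uses the simple plug-in idea that an observed partial count divided by f estimates the final count, which follows from the expected partial count being f times the expected total. the function names and the values of π are hypothetical.

```python
def duration_cdf(pi):
    """Build F(d) for d = 1..d_max from pi[d] = P(D = d | D <= d), d = 2..d_max."""
    d_max = max(pi)
    cdf = {d_max: 1.0}
    # Work backwards: F(d) = F(d + 1) * (1 - pi(d + 1)).
    for d in range(d_max - 1, 0, -1):
        cdf[d] = cdf[d + 1] * (1.0 - pi[d + 1])
    return cdf

def nowcast(partial_count, cdf_value):
    """Plug-in nowcast: scale the partial count by the share expected to be reported."""
    return partial_count / cdf_value

# Hypothetical conditional probabilities for d_max = 4 (illustration only).
pi = {2: 0.5, 3: 0.25, 4: 0.2}
cdf = duration_cdf(pi)
print({d: round(cdf[d], 3) for d in sorted(cdf)})  # {1: 0.3, 2: 0.6, 3: 0.8, 4: 1.0}

# If half of the eventual deaths are expected to be reported so far,
# 25 reported deaths correspond to a nowcast of 50.
print(nowcast(25, 0.5))  # 50.0
```

the last line mirrors the example from figure 3, where 25 reported deaths for a registration date are expected to grow to about 50; the value 0.5 for the distribution function is chosen here only to reproduce that ratio.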
thus, most of the covid-19 patients with fatal infections are expected to die not long after their registration date. the effect of $d$ becomes easier to interpret by visualizing the resulting distribution function $F_t(d)$. this is shown in figure 5 for two dates $t$, i.e. april 14th and may 13th. the plot also shows how the course of the disease hardly varies over calendar time: in fact, the small difference between the two distribution functions is dominated by the weekday effect, since the red curve is related to a tuesday while the blue one is from a wednesday. in figure 3 above we have shown the nowcasting results along with uncertainty intervals shaded in grey. these were constructed using a bootstrap approach as follows. given the fitted model, we simulate $n = 10\,000$ times from the asymptotic joint normal distribution of the estimated model parameters and derive the corresponding nowcasts $\hat{Y}_t^{(i)}$ via (7), where $C_{t,T-t}$ is the observed partial cumulated sum of deaths up to duration $T - t$. the pointwise lower and upper bounds of the 95% prediction intervals for the nowcast for $Y_t$ are then given by the 2.5% and the 97.5% quantiles of the set $\{\hat{Y}_t^{(i)}, i = 1, \dots, n\}$, respectively. in section 3 we presented the fitted death rate, which is the combination of a smooth spatial effect and region specific effects. it is of general interest to disentangle these two spatial components. this is provided by the model. we visualize the fitted global geographic trend $m_2(\cdot)$ for germany in figure 6. the plot confirms that up to may 2020 the northern parts of the country are less affected by the disease in comparison to the southern states. the two plots in figure 7 map the region specific effects, i.e. the predicted long term level of a district $u_{r0}$ (left hand side) and the predicted short term dynamics $u_{r1}$ (right hand side). [figure 7: long term region specific level (left hand side) and short term dynamics (right hand side) of the covid-19 infections.] both plots uncover quite some region-specific variability.
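the final step of the bootstrap construction described in this section, taking the 2.5% and 97.5% quantiles of the simulated nowcasts, can be sketched in python. the sketch covers only the quantile step and takes the simulated values as given; the linear-interpolation rule for quantiles is an assumption (the text does not say which rule was used), and the function names and placeholder draws are hypothetical.

```python
def percentile(sorted_values, q):
    """Percentile with linear interpolation between order statistics, q in [0, 100]."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

def prediction_interval(draws, level=95.0):
    """Pointwise interval from simulated nowcasts for one registration date."""
    ordered = sorted(draws)
    tail = (100.0 - level) / 2.0
    return percentile(ordered, tail), percentile(ordered, 100.0 - tail)

# Placeholder draws 1..101 standing in for simulated nowcasts (not real data).
draws = list(range(1, 102))
lower, upper = prediction_interval(draws)
print(lower, upper)  # 3.5 98.5
```

applying this separately for each registration date gives pointwise bounds of the kind shown as shaded bands in figure 3.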
in particular, the short term dynamics captured in the right hand side plot ($u_{r1}$) pinpoint districts with unexpectedly high nowcasted death rates in the last two weeks, after correcting for the global geographic trend and the long term effect of the district. some of the noticeable districts have already been highlighted in section 3 above, but we can detect further districts, which are less pronounced in figure 1. for instance, steinfurt (in the north-west of north rhine-westphalia), olpe (southern north rhine-westphalia) or gotha (center of thuringia) presently show a high rate of fatal infections. a large number of the registered deaths related to covid-19 stem from people in the age group 80+. locally increased numbers are often caused by an outbreak in a retirement home. such outbreaks apparently have a different effect on the spread of the disease, and the risk of an epidemic infection caused by outbreaks in this age group is limited. thus, the death rate of people in the age group 80+ could vary differently across districts when compared to regional peaks in the death rate of the rest of the population. in order to respect this, we decompose the district-specific effects $u_r$ in (2) into $u_r^{80-} = (u_{r0}^{80-}, u_{r1}^{80-})$ for the age group 80− and $u_r^{80+} = (u_{r0}^{80+}, u_{r1}^{80+})$ for the age group 80+, where the age group 80− consists of the aggregated age groups 15-34, 35-59 and 60-79. we put the same prior assumption on the random effects as we did in (3), but now the variance matrix that needs to be estimated from the data has dimension 4 by 4. the fitted age group-specific random effects are shown in figure 8, where the $u_r^{80-}$ are shown in the top panel and the $u_r^{80+}$ in the bottom panel. most evidently, the variation of the random effects is much higher in the age group 80+ when compared to the younger age groups, as more districts occur which are coloured dark blue or dark red, respectively.
when comparing the district-specific short term dynamics of the last 14 days (u r1 ) in figure 8 to those in figure 7 , we recognize that in most of the districts which recently experienced very high death intensities (with respect to the whole period of analysis), these stem from the age group 80+. as mentioned before, this can often be explained by outbreaks in retirement homes. when fitting the mortality model (1) we included the fitted nowcast model as offset parameter. this apparently neglects the estimation variability in the nowcasting model, which we explored via bootstrap as explained in section 5.3 and visualized in figure 3 . in order to also incorporate this uncertainty in the fit of the mortality model, we refitted the model using (a) the upper end and (b) the lower end of the prediction intervals shown in figure 3 . it appears that there is little (and hardly any visible) effect on the spatial components, which is therefore not shown here. but the time trend shown in figure 1 does change, which is visualized by including the two fitted functions corresponding to the 2.5% and 97.5% quantile of the offset function. we can see that the estimated uncertainty of the nowcast model mostly affects the last ten days, with a strong potential increase in the death rate mirroring a possible worst case scenario. in figure 9 we show a normal qq-plot of the pearson residuals in the nowcasting model. apart from some observations in the lower tail, the pearson residuals are distributed very closely to a standard normal distribution when considering the estimate φ = 1.766 of the dispersion parameter in the quasi-poisson model (7). overall, the model seems to fit to the available data quite well. the paper presents a model to monitor the dynamic behaviour of covid-19 infections based on death counts. it is important to highlight that the proposed model makes no use of new infection numbers, but only of observed deaths related to covid-19. 
this in turn means that the results are less dependent on testing strategies. the nowcasting approach enables us to estimate the number of deaths following a registered infection today, even if the fatal outcome has not occurred yet. moreover, the district level modelling uncovers hotspots, which are salient exclusively through increased death rates. a differential analysis of the number of current fatal infections on a regional level allows to draw conclusions on the current dynamics of the disease assuming a constant case fatality rate, i.e. a stable proportion of death compared to the true number of infections when adjusting for age and gender. a natural next step would now be to consider the nowcasted deaths in relation to the number of newly registered infections, which is, in contrast, highly dependent on both testing strategy and capacity. we consider this as future research, and the proposed model allows us to explore data in this direction. this might ultimately help us in shedding light on the relationship between registered and undetected infections as well as on the effectiveness of different testing strategies. there are several limitations to this study which we want to address as well. first and utmost, even though death counts are, with respect to cases counts, less dependent on testing strategies, they are not completely independent from them. this applies in particular to the handling of post-mortem tests. we therefore do not claim that our analysis of death counts is completely unaffected by testing strategies. secondly, a fundamental assumption in the model is the independence between the course of the disease and the number of infections. overall, if the local health systems have sufficient capacity and triage can be avoided, this assumption seems plausible, but it is difficult or even impossible to prove the assumption formally. 
finally, the nowcasting itself is not carried out on a regional level, though the model focuses on regional aspects of the pandemic. while it would be desirable to fit the nowcast model regionally, the limited amount of data simply prevents us from extending the model in this direction.

analysis of ordinal categorical data
real estimates of mortality following covid-19 infection
countries test tactics in 'war' against covid-19
estimating the number of infections and the impact of non-pharmaceutical interventions on covid-19 in 11 european countries
critical care utilization for the covid-19 outbreak in lombardy, italy: early experience and forecast during an emergency response
baseline characteristics and outcomes of 1591 patients infected with sars-cov-2
nowcasting the covid-19 pandemic in bavaria
probabilistic forecasting in infectious disease epidemiology: the 13th armitage lecture
bayesian nowcasting during the stec o104:h4 outbreak in germany
real-time estimation of the risk of death from novel coronavirus (covid-19) infection: inference using exported cases
adjustment for reporting delays and the prediction of occurred but not reported events
substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov2)
incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: a statistical analysis of publicly available case data
covid-19: forecasting short term hospital needs in france. medrxiv
spatio-temporal analysis of epidemic phenomena using the r package surveillance
quantifying bias of covid-19 prevalence and severity estimates in wuhan, china that depend on reported cases in international travelers
infection fatality rate of sars-cov-2 infection in a german community with a super-spreading event
smoothing and mixed models
case-fatality risk estimates for covid-19 calculated by using a lag time for fatality
generalized additive models: an introduction with r

we want to thank maximilian weigert and andreas bender for introducing us to the art of producing geographic maps with r. moreover, we would like to thank all members of the corona data analysis group (codag) at lmu munich for fruitful discussions.

key: cord-176131-0vrb3law authors: bao, richard; chen, august; gowda, jethin; mudide, shiva title: pecaiqr: a model for infectious disease applied to the covid-19 epidemic date: 2020-06-17 journal: nan doi: nan sha: doc_id: 176131 cord_uid: 0vrb3law the covid-19 pandemic has made clear the need to improve modern multivariate time-series forecasting models. current state of the art predictions of future daily deaths and, especially, hospital resource usage have confidence intervals that are unacceptably wide. policy makers and hospitals require accurate forecasts to make informed decisions on passing legislation and allocating resources. we used us county-level data on daily deaths and population statistics to forecast future deaths. we extended the sir epidemiological model to a novel model we call the pecaiqr model. it adds several new variables and parameters to the naive sir model by taking into account the ramifications of the partial quarantining implemented in the us. we fitted data to the model parameters with numerical integration.
because of the fit degeneracy in parameter space and non-constant nature of the parameters, we developed several methods to optimize our fit, such as training on the data tail and training on specific policy regimes. we use cross-validation to tune our hyper parameters at the county level and generate a cdf for future daily deaths. for predictions made from training data up to may 25th, we consistently obtained an averaged pinball loss score of 0.096 on a 14 day forecast. we finally present examples of possible avenues for utility from our model. we generate longer-time horizon predictions over various 1-month windows in the past, forecast how many medical resources such as ventilators and icu beds will be needed in counties, and evaluate the efficacy of our model in other countries. we used the county-level cumulative and daily death reporting from the new york times [1] , as well as the county-level active cases reporting from johns hopkins university [2] . we also used county-level population statistics from the 2017 american community survey and county-level policy dating information from johns hopkins university. specifically, we loaded the policy dating information regarding the implementation of the stay at home orders. we loaded these data into a data frame by matching the county fips codes from each source, and computed a moving average death statistic as an additional feature, using a window of size three days. we dealt with missing values in the fips codes and death reporting by removing the corresponding observations. the only exception to this was the county 36061, which represents new york city. this county had a missing fips code, but had well reported data. since new york city is the most active covid-19 hotspot in the united states, its manual inclusion was necessary. 
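the moving average death statistic with a window of three days can be sketched in python. the text does not say whether the window is trailing or centered, so the sketch assumes a trailing window that is shortened at the start of the series; the function name and the example counts are hypothetical.

```python
def trailing_moving_average(values, window=3):
    """Mean of the current value and up to window - 1 preceding values."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)  # shorter window at the start of the series
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# Made-up daily death counts for one county (illustration only).
daily_deaths = [0, 3, 6, 9, 3]
smoothed = trailing_moving_average(daily_deaths)
print(smoothed)  # [0.0, 1.5, 3.0, 6.0, 6.0]
```

a trailing window uses only past observations, which keeps the feature usable for forecasting; a centered window would smooth more symmetrically but would need future values.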
in addition to the aforementioned data sources, we experimented with mobility data, but this did not make it into our final epidemiological model because there was no simple mapping to any of the model variables, and we believed the policy dating information was sufficient to establish distinct regimes in the data. the traditional sir epidemiological model breaks up a region's population into three separate groups: susceptible, infected, and removed [3]. the primary flaw of applying this model to the current covid-19 pandemic is the poor approximation of assuming that all people in each group have uniform experiences. clumping the population into only three groups is an example of omitted-variable bias, excluding major realities caused by the scale of the pandemic. the pecaiqr model that is described below takes each group in the sir model and breaks them down into another level of classification. the variables in our model are developed following simple logical arguments based on current global realities and widely accepted scientific and epidemiological results. we first break down the susceptible class of the sir model. due to policies implemented in the us, many individuals have been self-quarantining [4]. it is sensible then that each day only a fraction of the susceptible population are exposed to potentially getting the virus from the outside world. following this logic, our model breaks up the susceptible population into three classes: protected, exposed, and carriers (explained in detail below). next, we consider the infected group of sir. there is evidence from the large data sets gathered in south korea and other well-respected global scientific efforts that a significant fraction of people infected with covid-19 do not display visible symptoms [5]. using this line of logic our model breaks the infected group into two classes: asymptomatic and infectious. 
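for reference, the baseline sir model that pecaiqr extends can be sketched as below. this is the textbook three-compartment system, not the pecaiqr equations. the parameter names `beta` and `gamma`, the forward euler stepping, and the example values are illustrative assumptions and are not taken from the paper.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, steps_per_day=100):
    """integrate the textbook sir equations with forward euler steps.

    ds/dt = -beta * s * i / n
    di/dt =  beta * s * i / n - gamma * i
    dr/dt =  gamma * i
    returns one (s, i, r) tuple per day, including day 0.
    """
    n = s0 + i0 + r0
    dt = 1.0 / steps_per_day
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_removals = gamma * i * dt
            s -= new_infections
            i += new_infections - new_removals
            r += new_removals
        history.append((s, i, r))
    return history


# hypothetical example values, chosen only to produce a visible epidemic curve
trajectory = simulate_sir(beta=0.3, gamma=0.1, s0=9990, i0=10, r0=0, days=160)
```

every person in this sketch moves through the same three states at the same rates, which is the uniformity assumption that the additional pecaiqr classes are meant to relax.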
an important note is that we assume these asymptomatic and infectious people still actively participate in the community, enabling them to come in contact with exposed people. finally, we are left with the removed group. our model incorporates two lines of logic in the breakdown of this group. first, it is sensible that a portion of the members of the asymptomatic and infectious classes end up self-quarantining (at least in effect) as a result of getting tested, showing initial symptoms, or having an "intuition" that they contracted the disease. second, much of the evidence about covid-19 so far shows that an individual who contracts the disease cannot get the disease again (at least on the order of a few months) [6]. following these arguments, the model breaks the removed group into quarantined and removed classes. removed further has subclasses of dead people and recovered people, who are assumed to be immune from the virus. each of the variables we describe changes continuously with time. we describe the variables for a given time t (associated with some given day t).
protected: people who did not go outside to expose themselves to any infected individuals (asymptomatic or infectious) on t, though they have a chance of coming in contact with a carrier. any person in protected also has a chance of joining the exposed class at a later time t + δ (interpreted as the next day: t + 1), as they may wish to travel outside that day for any reason.
exposed: people who went outside on t and therefore had a chance of coming into contact with an infected individual (asymptomatic or infectious) to become a carrier. any member also has a chance of joining the protected class on the following day t + δ, as they may wish to self-isolate the next day for any reason.
carrier: people in the model in a purposely temporary position. they are people in the exposed class who came into contact with an asymptomatic or infectious person and got the disease "on their hands" to some degree. carriers returning home on t have a chance of spreading the disease to some protected people living at their home, making those people either asymptomatic or infectious on t + δ. carriers at time t also have a chance to either "touch their face" and contract the disease themselves (becoming asymptomatic or infectious on t + δ), or "wash their hands" and return to being a member of the exposed class on t + δ.
asymptomatic: people who were infected by covid-19 and are contagious on t, but show no symptoms. these people are assumed to be active in the public, in that they have a chance to spread the virus to exposed people in public areas on t. an asymptomatic person on t + δ has a chance of becoming a member of infectious after showing initial symptoms, a member of quarantined after somehow finding that they contracted the virus, or a member of removed after having the disease pass through their immune system. here, all the people going from asymptomatic to removed would go to the recovered subclass.
infectious: people who were infected by covid-19, are contagious on t, and show symptoms. these people are assumed to be active in the public, in that they have a chance to spread the virus to exposed people in public areas on t. an infectious person on t + δ has a chance of becoming a member of quarantined after figuring that they contracted the virus and a chance of becoming a member of removed after having the disease pass through their immune system. when these infectious people become a member of removed they have a chance of dying, to become a member of the dead, or living and becoming a member of recovered.
quarantined: people who were originally asymptomatic or infectious and subsequently removed themselves from the public and are self-quarantining on t. these people are assumed to not be able to spread the disease to any people, so this group includes people who have covid-19 but are in a non-contagious stage. a quarantined person on t + δ has a chance of becoming a member of removed. when these quarantined people become a member of removed they have a chance of dying, to become a member of the dead, or living and becoming a member of recovered.
removed: people who were originally asymptomatic, infectious, or quarantined, and had the disease fully pass through their immune system on or before t. these people belong to one of two subclasses: dead and recovered. recovered people are assumed to have survived the disease, and will not get infected again.
to fit the deaths data to the system of differential equations in the pecaiqr model, we performed numerical integration using the scipy odeint package [7], and traversed the parameter space to find a set of parameters that minimized the least squares error of each fit variable in relation to its observed variables. due to the size of the parameter space, this requires an initial guess for the parameters and the initial conditions of each of the 7 pecaiqr variables, as well as defined ranges to restrict the size of the parameter space. of course, we can simply examine the entire logical parameter space (with allowed values ranging from 0 to 1) for each of the parameters, and for each of the pecaiqr variables as well, and then use a random guess within that space to initialize the least squares minimization. however, this is unnecessarily inefficient. 
the actual parameters may differ widely from county to county, but they all share a similar order of magnitude. a better approach is to use this exhaustive, unconstrained search only once, on a relatively mature curve like lombardy in italy or new york in the united states, to extract a reasonable guess for these orders of magnitude, and then simply feed in this guess for the other counties as well, to provide a more logical starting point in the least squares minimization. due to the complexity of our model, however, there are a few more nuanced details that we will discuss later. first, we will elaborate more on the fitting procedure. initially, we only fit the death variable (d) to the observed death reporting. early on, we decided not to fit our infection curves to daily case statistics, as they have inconsistent and unreliable reporting, so this would only diminish the accuracy of the fit to observed deaths, which is a much more reliable statistic. however, later on, we realized that we could also effectively fit the quarantined variable (q) in the pecaiqr model to the active cases reporting. the intuition behind this is the assumption that those who test positive would either self-quarantine at home or be forcibly quarantined in a hospital if their condition is severe enough. of course, this does not capture all the self-quarantined individuals, so we permitted a large degree of fuzziness in the fit of the quarantined variable (q) to active cases. we achieved this with a bias scaling factor that gave much more weight to observed deaths in the fit of the death variable (d). this causes the model to prioritize the deaths fit over the active cases fit, so the active cases fit becomes more of a suggestion than a constraint. 
the hope is that the least squares error on the active cases is not large enough in magnitude to compromise the deaths fit and force the parameter space into a different minimum, but will rather provide a subtle correction around the local minimum discovered by the minimization of the least squares error on the deaths. unfortunately, a fit to active cases is not helpful for most counties, as most counties do not maintain their active cases reporting well, and moreover we discovered that the criteria for an active case may vary widely across different states. we also realized that the pecaiqr epidemiological model parameters are not static, due to the dynamic and rapidly evolving state of the pandemic, caused mainly by external forces such as social distancing protocols and lock down policies. a naive fit on the data would only yield some sort of average of the parameters. but since we only care about the most recent characteristics of the death and infection curves when making predictions into the future, we can do better than this by affording the more recent data points more weight in the least squares error calculation. this forces the minimization to favor a solution that fits more heavily on a more recent window of time, while still retaining the effects of the past data to some degree. we can achieve this with a geometric progression of weights, as well as a bias term that sets a maximum weight for all data points before a certain cutoff. we called this method training on the tail. another method to separate more recent parameters from the data is to use the assumption that there are distinct parameter regimes correlating to the start of policies -the stay at home orders in particular. note that these policies directly affect the infection curves because they limit the spread of disease. 
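the "training on the tail" weighting described above can be sketched as follows. this is a minimal illustration under stated assumptions, not the authors' code: the names `ratio`, `cutoff`, and `max_early_weight` are hypothetical, and the way the bias term caps the weights before the cutoff is one possible reading of the description.

```python
def tail_weights(n_points, ratio=0.9, cutoff=None, max_early_weight=0.1):
    """weights for a least squares fit that favor recent observations.

    the most recent point gets weight 1 and each earlier point is scaled
    down by `ratio` (a geometric progression). points with an index below
    `cutoff` additionally have their weight capped at `max_early_weight`.
    """
    weights = [ratio ** (n_points - 1 - k) for k in range(n_points)]
    if cutoff is not None:
        weights = [min(w, max_early_weight) if k < cutoff else w
                   for k, w in enumerate(weights)]
    return weights


def weighted_squared_error(observed, predicted, weights):
    """weighted sum of squared residuals, the quantity to be minimized."""
    return sum(w * (o - p) ** 2
               for o, p, w in zip(observed, predicted, weights))
```

with a ratio close to 1 the fit approaches an unweighted least squares fit, while a smaller ratio makes the fit depend almost entirely on the last few days.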
for the death curve, there is typically an offset between the date of policy implementation and the date at which the effects become apparent in the death data; this offset is approximately equal to the average time until death for covid-19. with this assumption, we separate the deaths data into two training regimes: one for dates before the policy implementation plus the offset, and one for dates after the policy implementation plus the offset. then, we train one fit on the first regime, and feed the fitted parameters and predicted variables on the date of the policy implementation as the guesses and initial conditions, respectively, of a second fit on the second regime. we called this method training on the policy regime. we perform the procedures described above at the county level, and then use the fitted parameters with the numerical integration to extrapolate into the future to get county-specific predictions. note that the predictions and their errors are in cumulative deaths, but for the sake of visualization, we converted to daily deaths later. to get the errors, we initially used a method that calculated error bounds by finding the parameter variance from the residual variance using the covariance matrix of residuals and the jacobian around the fitted parameters. having obtained the mean and standard deviation for each of the parameters, we assumed that each parameter had normally distributed values, and so we sampled 100 parameter sets. we could not simply calculate the parameters for each confidence interval from their mean and standard deviation because it is not obvious which parameters are positively correlated or negatively correlated with the deaths predictions. this method turned out not to be ideal in all cases, as there are certain regions in the parameter space that are invalid and cause the error bars to spike. we then developed a bounding method that allowed a reliable way to accurately tighten our confidence intervals. 
this method infers the error bars from the deviation, relative to the predicted fit, of a smoothed version of the residuals calculated from the moving average of daily deaths, which is equivalent to the moving average of the slope of the cumulative deaths. the procedure generates a pdf of deaths around the best fit prediction. for each time t in the training range, compute the difference in "slope" between the fit and the actual data. the slope in the fit is simply the current predicted death minus the previous death on the fit curve. for the slope in the actual data, to mitigate the dominating effects of outliers, we find the slope as the difference between consecutive points of the moving average (window of 3 days) instead. we then define a normal distribution of slope ratios. for each time t in the training data, we find the ratio of the actual (moving average) slope and the fit slope, and fill a list of these ratios. next we find the mean and standard deviation of the distribution as the mean and (sample) standard deviation of the ratios list. outliers, which we remove, are defined as values that are above three standard deviations from the mean. this is necessary because the ratios are unrealistically high in the small number limit. for each time t in the extrapolated prediction fit we generate a confidence range of 100 points. we find the slope at the extrapolated time as the predicted death minus the predicted death at the previous time. this slope is multiplied by a random scalar sampled from the normal distribution of ratios and denoted as s. we multiply the predicted death at the previous time t − 1 in the fit by 1 + s and add this as a point in the pdf at time t. this multiplication is repeated 100 times so that 100 points are generated. we get the discrete 10, 20, 30, 40, 50, 60, 70, 80, 90 cdf percentiles by sampling the pdf. 
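the slope-ratio procedure can be sketched roughly as below. this is a simplified reading of the description rather than the authors' implementation: it samples scaled slopes (daily deaths) and reads off percentiles, and it stops short of the final step that turns the scaled slope back into a cumulative value. the function names, the trailing form of the moving average, the skipping of zero fit slopes, the clipping at zero, and the fixed seed are all assumptions made for illustration.

```python
import random
import statistics


def moving_average(values, window=3):
    """trailing moving average; early points average over what is available."""
    out = []
    for k in range(len(values)):
        chunk = values[max(0, k - window + 1): k + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def slope_ratios(actual_cumulative, fit_cumulative, window=3):
    """ratio of the smoothed actual slope to the fit slope at each time."""
    smoothed = moving_average(actual_cumulative, window)
    ratios = []
    for t in range(1, len(fit_cumulative)):
        fit_slope = fit_cumulative[t] - fit_cumulative[t - 1]
        if fit_slope > 0:  # assumption: skip times where the ratio is undefined
            ratios.append((smoothed[t] - smoothed[t - 1]) / fit_slope)
    return ratios


def drop_outliers(ratios, n_sigma=3.0):
    """remove ratios more than n_sigma sample standard deviations above the mean."""
    mean, sd = statistics.mean(ratios), statistics.stdev(ratios)
    return [r for r in ratios if r - mean <= n_sigma * sd]


def daily_death_percentiles(fit_slope, ratios, n_samples=100, seed=0):
    """sample scaled slopes and return the 10th, 20th, ..., 90th percentiles."""
    rng = random.Random(seed)
    mean, sd = statistics.mean(ratios), statistics.stdev(ratios)
    # assumption: negative daily deaths are clipped to zero
    samples = sorted(max(0.0, fit_slope * rng.gauss(mean, sd))
                     for _ in range(n_samples))
    return [samples[int(q / 100 * (n_samples - 1))] for q in range(10, 100, 10)]


# hypothetical cumulative death series, for illustration only
actual = [0, 2, 5, 9, 14, 20, 27, 35]
fit = [0, 2, 4, 8, 13, 19, 26, 34]
ratios = drop_outliers(slope_ratios(actual, fit))
percentiles = daily_death_percentiles(8.0, ratios)
```

because the spread of the sampled values comes from the observed ratio distribution, noisier counties automatically receive wider percentile bands.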
the methods described in the past two subsections are implemented as options that can be activated with hyper parameters, and collectively they provide several different ways to fit the pecaiqr model and generate the confidence intervals. due to the time constraint of the competition, we were not able to develop a sophisticated blending method to optimize on the hyper-parameter space. however, we were able to sample a few different combinations of hyper-parameters and determine on a county-level which one works the best for each county, by modifying the evaluation script provided by the tas. this script essentially computes pinball [8] loss for the submission file of predictions, scored against the most recent data. to determine a good set of hyperparameters for each county, we simply trained using a two week cutoff in the data, and scored the predictions for the subsequent two weeks using the evaluation script. for predictions made from training data up to may 25th, we were able to obtain an average pinball loss of 0.096 for all counties with scoring on a 14 day forecast. our greatest weakness was the lack of a second working model with which we could cross validate, as well as the lack of a sophisticated blending method to optimize on the hyper-parameter space for the single model. we were not able to develop an alternative working model due to the time constraints of our group members, but a detailed description of our attempts is available in section 5 failed models. having two different models would have allowed us to mitigate their individual weaknesses and account for their individual edge cases with the other's strengths. the epidemiological model has several major weaknesses. although it works well for counties with well recorded data, it fails for the vast majority of counties, which have low or noisy death statistics. 
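the pinball loss used for scoring above is the standard quantile loss. a minimal sketch follows; the evaluation script itself is not reproduced in the text, so the plain averaging over quantiles in `mean_pinball_loss` is an assumption about how it aggregates.

```python
def pinball_loss(actual, predicted, quantile):
    """pinball (quantile) loss for a single prediction.

    under-prediction is penalized by `quantile`, over-prediction by
    `1 - quantile`, so a high quantile is pushed above the observations.
    """
    diff = actual - predicted
    return quantile * diff if diff >= 0 else (quantile - 1.0) * diff


def mean_pinball_loss(actual, predicted_by_quantile):
    """average pinball loss over a dict mapping quantile -> predicted value."""
    losses = [pinball_loss(actual, predicted, q)
              for q, predicted in predicted_by_quantile.items()]
    return sum(losses) / len(losses)
```

a perfect set of quantile predictions scores zero, and a forecast of all zeros scores well only when the observed daily deaths are themselves near zero, which is why it is a hard baseline to beat for counties with few deaths.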
we called these counties the "non-convergent counties" because the epidemiological model was not able to converge to a parameter set that yielded predictions which could consistently score better than the naive all zeros prediction. the curves for the pecaiqr variables in the left column of the preceding figures make sense intuitively. we expect to see conversion between the protected and exposed populations, and the removed population to eventually dominate. we also expect the carrier population to peak before the infected population, which should have a similar shape to the asymptomatic population, and lastly we expect the infected and asymptomatic populations to peak before the deceased population peaks. lastly, the second figure, which uses the "fit on tail" method, more closely captures the uncertainty of the tail end of the curve, as expected. for the "non-convergent counties," we attempted to skip over the parameter fitting step by guessing parameters based on the parameters of similar counties. in order to do this, we need to first establish proof of concept that there is some correlation between the non-covid features of a county and its pecaiqr parameters. so we visualized the parameters for all the "convergent" counties by using dimension reduction algorithms. we used principal component analysis [9], singular value decomposition [10], and t-distributed stochastic neighbor embedding. figure 3: several different visualizations of the parameter space using dimension reduction algorithms. each point represents a county, and its size correlates with the county population. for svd, we plotted the first two columns of the left matrix against each other, which produces a visualization of how similar the counties are by distance, based on their abstract preferences in the parameter space. 
although we did not have time to further explore the possible correlation between parameter clusters and non-covid county-specific features, we were excited to see that there are some distinct intrinsic patterns in the parameter space. further research would involve classifying these clusters with a clustering algorithm, and then attempting to predict the cluster labels or quantified distances of specific counties with a regression algorithm from the non-covid feature space. we also realized that, even for the larger counties, there was some uncertainty caused by the complexity of our model. since our parameter space has so many dimensions, there is an issue with degeneracy. the least squares minimization settles on a parameter set that yields a minimum in the least squares error metric, but this is not necessarily the only minimum or even the best minimum. we were convinced of this when we discovered that for each county, we could get distinct parameter regimes with different initial guesses, and that these solutions are similarly valid. we will show this in the figures below, by repeating the procedure from the previous section, but with a different set of parameters as the initial guess for the least squares minimization. following these observations, we hypothesized that there could be better parameter guesses that we have not encountered. in order to get better parameter guesses, further research should focus on deriving reasonable values for the more intuitive parameters of the model, such as those concerning the rate of infection, from existing data. this also illuminates the issue of ode stiffness. we believe the functionality of the scipy odeint package is quite limited. given more time, we would explore other statistical packages like stan, which has a state of the art implementation of 4th order runge kutta numerical integration for stiff odes [11] . 
another improvement would be to use a numerical integrator that can handle delay differential equations, which would allow us to have more control over the time delay in expression between infected and death states. we believe that overall the pecaiqr model is very promising, and reveals the benefit of attempting more ambitious, complex epidemiological models. the epidemiological model's differential equations establish intuitive rules by which it operates, so it has more long term predictive power than most other models. the solutions of the pecaiqr model also have an interesting shape, as they resemble heavy-tailed distributions similar to the frechet/weibull distribution. the heavy tail is especially important for pandemic forecasting, as we expect the daily deaths to fluctuate around a low value for some time, instead of decaying immediately to zero. in this respect, the tail end of the curve is almost stochastic in nature, once the infection curves lose enough momentum. due to the high variability found in the data for reported deaths for each county, with some counties reporting unrealistic jumps in the numbers of deaths, it seemed sensible to try using a stochastic model to make some predictions of death counts for counties. in order to best preprocess the county data for training, we explored possible clustering methods that would be able to cluster time series of varying lengths. dynamic time warping (dtw) was picked as the distance measure of choice as it is able to compare time series of different lengths and provide a "warped distance" between each series of daily reported deaths. the daily reported deaths were preprocessed by first removing the series with all 0's, and then performing a z-normalization on each data series. the data series were then compared with dynamic time warping, and a hierarchical clustering was developed based on the dtw distance matrix. the number of clusters was determined using the elbow method. 
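the preprocessing and distance computation described above can be sketched with the classic dynamic-programming form of dtw. this is a generic illustration, not the authors' code: the absolute difference as local cost and the guard for constant series are assumptions, and the hierarchical clustering step on the resulting distance matrix is omitted.

```python
import math


def z_normalize(series):
    """rescale a series to zero mean and unit (population) standard deviation."""
    mean = sum(series) / len(series)
    sd = math.sqrt(sum((x - mean) ** 2 for x in series) / len(series))
    if sd == 0:  # assumption: constant series map to all zeros
        return [0.0 for _ in series]
    return [(x - mean) / sd for x in series]


def dtw_distance(a, b):
    """dtw distance between two series that may differ in length."""
    inf = float("inf")
    # cost[i][j] is the cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # repeat b[j-1]
                                     cost[i][j - 1],      # repeat a[i-1]
                                     cost[i - 1][j - 1])  # advance both
    return cost[len(a)][len(b)]
```

two series that differ only by a locally stretched segment have a dtw distance of zero, which is what allows counties whose outbreaks unfolded at different speeds to land in the same cluster.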
the series that had all 0's were then added back into the clustering list, with cluster id '0'. it was desired that further clustering be done, specifically clustering involving hmm-based clustering. the ideal setup would be a clustering based on an iterative dtw-hmm clustering algorithm, in order to fully extract similarities between county reported deaths. unfortunately we did not have time to fully develop this idea. once the clusterings were developed, each cluster was used to train an hmm model. we used hmmlearn's gaussianhmm model in order to have the gaussian emissions needed for this kind of data (a multinomial model with discrete output would be stretched thin with ∼100 states). the number of states each gaussianhmm was given per cluster was chosen by doing a modified version of the elbow method, to ensure that the hmm is complex/simple enough to match the variance in reported deaths. once these hmms were trained, we developed a method to initiate an hmm close to a particular starting emission and allow it to generate 14 subsequent emissions, to make 14-day predictions for the counties in the cluster. these predictions could be made thousands of times for each hmm, allowing each cluster to effectively create a probability distribution of predictions. further work needed to be done to make this stochastic model effective. as the model only allowed for predictions based off of an entire cluster, there needed to be a secondary layer of scaling to allow a cluster prediction to be mapped to each individual county. an idea of attempting to do some time-series comparison and do some sort of series stretching/scaling was developed, but was never fully formed. there is definite room for improvement and optimization in this model. one fundamental issue is that perhaps the small number of states of the gaussianhmm limits the complexity of each hmm per cluster. 
while this is true, this level of simplicity mixed with stochasticity could capture the random element seen in the reported deaths data. if this random element is able to be removed or smoothed away, however, perhaps this model would not be so useful any longer. when attempting to use statistical regression to model nonlinear correlations in data, a common approach is to employ a bayesian non-parametric strategy, such as the gaussian process. bayesian non-parametric strategies like the gaussian process are essentially extensions of bayesian inference on an infinite-dimensional parameter space. very loosely, this allows us to model the data as the combination of many different gaussians (each quite accurate in its local region), stitched together to create a single model [12]. this technique was attempted in the latter stages of the course to create predictions for counties where the predictions from the pecaiqr model did not converge, an issue at the time for counties with poor data. we used a custom implementation of gaussian processes, using a mean function of 0 and a squared exponential kernel. the data used was the rolling average of deaths over a three day window vs time. in this model, there are three parameters: l, which is the length parameter for the kernel, σ f, which is the vertical variation parameter for the kernel, and σ y, which is the noise parameter. for each county, the optimal hyperparameters for that county were found by searching for the set of parameters within a given range that minimized the error. here, the bounds on the parameters were 5.0 ≤ l ≤ 15.0, 0.1 ≤ σ f ≤ m 500.0 , 0.1 ≤ σ ≤ m 10.0 where m was the maximum value of rolling three day average deaths over the past 14 days. the error function was the root mean squared error (rmse) over the last 30 days. these bounds and the error function were determined with validation procedures. 
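a zero-mean gaussian process with a squared exponential kernel can be sketched in a few lines. this is a generic textbook formulation and not the authors' custom implementation: the function names, the small linear solver, and the example data and hyperparameter values are assumptions for illustration. the posterior mean at a test point x is k(x)^T (K + σ_y^2 I)^(-1) y, where K is the kernel matrix of the training times.

```python
import math


def sq_exp_kernel(x1, x2, length, sigma_f):
    """squared exponential kernel: sigma_f^2 * exp(-(x1 - x2)^2 / (2 l^2))."""
    return sigma_f ** 2 * math.exp(-((x1 - x2) ** 2) / (2.0 * length ** 2))


def solve(matrix, rhs):
    """solve matrix @ x = rhs by gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[k]] for k, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def gp_posterior_mean(train_x, train_y, test_x, length, sigma_f, sigma_y):
    """posterior mean of a zero-mean gp with observation noise sigma_y."""
    n = len(train_x)
    gram = [[sq_exp_kernel(train_x[i], train_x[j], length, sigma_f)
             + (sigma_y ** 2 if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    alpha = solve(gram, train_y)
    return [sum(sq_exp_kernel(x, train_x[i], length, sigma_f) * alpha[i]
                for i in range(n))
            for x in test_x]


# hypothetical smoothed daily deaths over ten days
days = [float(k) for k in range(10)]
smoothed_deaths = [5.0 + k for k in days]
near, far = gp_posterior_mean(days, smoothed_deaths, [9.0, 200.0],
                              length=5.0, sigma_f=10.0, sigma_y=1.0)
```

the kernel value between a test time and every training time shrinks toward zero as the test time moves away from the data, so `far` falls back to the prior mean of zero regardless of the trend in the training data.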
the gaussian process seems to fit quite accurately in the short term, but the predictions quickly drop to zero, as demonstrated in the plots above. unlike the pecaiqr model, it does not retain a heavy tail. therefore, the gaussian process does not have long term predictive power. we can possibly improve the gaussian process and overcome the issue by setting a custom mean instead of using the default zero mean, which likely contributes to the rapid decay of the tail. we did attempt to do so, inspired by the pymc3 tutorial posted in the cs156b piazza [13]. it was attempted since a custom mean function could be used [14]; similarities were noted between the graph of the number of deaths vs time in new york county and the weibull distribution, so the weibull distribution was this custom mean. however, it was not pursued further since it could not successfully run: when training the model on just one particular county, the program timed out and crashed jupyter on the computer it was run on. since the final deadline was already quite close, this attempt was abandoned. given how this model can fit to a custom mean, if it was attempted earlier it might have proved to be useful. we verify that the issue of a rapidly decaying tail is not specific to county 36061 as shown above. we can conclude that the gaussian process fits very well on the training data, but fails to present long term predictive power. again, these failures in the gaussian process model may be overcome by setting a custom mean. but we believe that the epidemiological model, though less accurate, has intrinsic advantages that the gaussian process cannot match. this is because the epidemiological model has some sort of intuition in the form of the rules established by its differential equations, while the gaussian process is purely curve fitting. 
perhaps, given more time, we could have combined these two models to utilize both the short term accuracy of the gaussian process and the long term predictive power of the epidemiological model. analysis of model predictions for different cutoff dates in the training data shows that the model is quite stable and consistent when predicting on the region past the peak. however, predicting before the peak is much harder, as we are no longer operating with the assumption that the infection curves are dying down. in sub-figure d) we see that the peak daily deaths value predicted by the model is significantly less than the actual peak that is revealed with more data. however, the location of the peak is correct. we realized that the training data at this early cutoff is largely dominated by data in the regime before the first stay at home order, but we were training using an initial parameter guess that accounted for the effects of a stay at home order, and so in sub-figure e), we tried a different set of parameters for the initial parameter guess in the fitting procedure, inspired from the parameters obtained from a fit on similar early curves in italian regions. this modification yielded a more accurate prediction of the peak daily deaths value, but a less accurate placement of the peak. the prediction curve becomes extended when we use this alternative parameter set because the model no longer assumes that there will be a stay at home order, and therefore the curve will not flatten to the same degree. analysis of model predictions for different cutoff dates in the training data shows that the model is quite stable and consistent when predicting on the region past the peak. one thing to note is that sub-figure a), trained on the most recent data, seems to have a much longer tail. this is a result of the high level of noise in the more recent data points, which are not included in the other cutoffs. 
predicting before the peak is much harder, as we are no longer operating with the assumption that the infection curves are dying down. in sub-figure d), we see that the fit is quite different from the fits with later cutoffs. the data before this cutoff is quite noisy, so the model cannot accurately predict when the infection curves will begin to die off. to fix this, in sub-figure e) we fit the model on active cases as well. the model is able to use the active case statistic to determine that the infection curve dies down earlier than it otherwise would predict. here, we fit each of the three counties using the policy regime method described in section 3.1 fitting. the faint gray curve shows the predicted infection curve on the data regime before the date of the stay at home order, plus the time of death, and the dark gray curve shows the predicted infection curve on the data regime after the stay at home order. in counties 36059 and 27053, the stay at home order seems to have flattened the curve, but in county 36061 the effects are more ambiguous. this likely relates to factors specific to new york that worsened the outbreak, such as the high population density of new york city, and the shortage of hospital resources later on. the model can be easily adapted to train on international data. at the bare minimum, the model only requires death reporting and population. the predictions made from the may 4th cutoff, roughly a month before the june 3rd cutoff, show that the prediction fit is quite stable and consistent. figure 13: the first column of figures shows daily deaths and currently hospitalized, daily hospitalized, and currently in icu, respectively, plotted against time. the second column shows the linear regression for daily deaths vs currently hospitalized, daily deaths vs daily hospitalized, and daily deaths vs currently on ventilator, respectively. 
to smooth out the data, we used a moving average with a window of 7 data points. we then aligned the peaks for each hospital statistic to match the peak of the daily deaths, in order to account for the offset term that corresponds with the average time between hospital admittance and death. the first column shows the hospital statistics before they were aligned with the death statistic, and there is clearly a shift, although only by a few days. this makes sense, as patients who are admitted to the hospital are likely patients who have already developed a severe condition. note that the hospital statistics and the death statistics are both approximately gaussian in shape. this suggests that we can find some scaling factor once we align their peaks. for this, we perform a linear regression against the death statistic, revealing a definitive linear correlation for all the statistics. note that there is a slight nonlinearity in the regression between currently hospitalized and daily deaths, as the daily deaths increase beyond a certain point. this may indicate that we are nearing the hospital capacity, and so the change in number of currently hospitalized patients begins to lag behind the change in daily deaths. also note that there is an extreme outlier in the regression between currently on ventilator and daily deaths at the peak daily deaths value. this is likely also caused by some limit on the number of ventilators available. indeed, there seems to be a ceiling to the number of people on ventilators past a certain number of daily deaths. using the linear regression fit, we can now directly convert any of our model predictions, as well as their confidence intervals, to predictions for hospital resources. we noticed that the vast majority of counties have inconsistent reporting of deaths. some counties even had cumulative death statistics that decreased on certain dates. clearly this is not possible, as death is permanent.
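the smoothing, peak alignment, and regression steps described above for the hospital statistics can be sketched roughly as follows. this is a minimal illustration under stated assumptions, not the authors' code: the function names (`moving_average`, `align_to_peak`, `fit_scale`) are hypothetical, the series are assumed to be equally spaced daily numpy arrays, and the alignment simply shifts one series so that its maximum coincides with the maximum of daily deaths.

```python
import numpy as np

def moving_average(x, window=7):
    """Moving average over `window` consecutive points.

    Uses mode="valid", so the result has len(x) - window + 1 entries.
    """
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="valid")

def align_to_peak(stat, deaths):
    """Shift `stat` so that its peak lines up with the peak of `deaths`.

    Returns the overlapping parts of both series and the lag in days
    (positive when deaths peak later than the hospital statistic).
    """
    lag = int(np.argmax(deaths)) - int(np.argmax(stat))
    if lag >= 0:
        s, d = stat[: len(stat) - lag], deaths[lag:]
    else:
        s, d = stat[-lag:], deaths[: len(deaths) + lag]
    n = min(len(s), len(d))
    return s[:n], d[:n], lag

def fit_scale(stat, deaths):
    """Least-squares line: stat ~ slope * deaths + intercept."""
    slope, intercept = np.polyfit(deaths, stat, 1)
    return slope, intercept
```

with a fitted `slope` and `intercept`, a predicted daily-deaths curve (or either bound of its confidence interval) can be mapped to the hospital statistic as `slope * predicted_deaths + intercept`, and then shifted back by `lag` days. a straight line will not capture the capacity ceiling noted above for ventilators, so values near the peak should be read with caution.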
other counties report constant values for cumulative deaths (zero values for daily deaths) for an extended period of time, followed by a quick spike. it is doubtful that all deaths suddenly occur in a single day, so this observation suggests that the deaths reporting might not be distributed correctly, perhaps due to administrative lag. for counties with low numbers of deaths, this creates large amounts of variance in the data, which makes it hard to fit. the deaths reporting also seems to be correlated with the day of the week. for many counties there is an interesting trend that the daily deaths increase over consecutive days before the weekend. again, this might be due to administrative lag in the hospitals that report deaths. the active cases statistics also seem to be poorly recorded, and this makes sense given how ambiguous the classification of an active case can be, especially with the limitations on testing. many counties completely lack useful active case statistics, and in other cases, they are only available very late into the infection curve. in general, statistics that attempt to quantify the number of cases, whether active, cumulative, or daily, are intrinsically unreliable, as they depend on the availability of tests, which can vary greatly over time. perhaps a more reliable statistic is the ratio of positive tests to administered tests on any given day. another issue was with the non time-series data. some files, such as the age race file from [15] and the aggregate jhu file from [16], had valuable information but also numerous holes. while this data was not used in creating the predictions, it was used heavily in section 11 creative data visualizations, which details an attempt at clustering based on categories in these datasets. such holes proved difficult to fill in. the filling in was done by using the data from the nearest county if a given entry was missing for a certain county.
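the nearest-county fill described above can be sketched as follows. this is only an illustration under assumptions that the text does not state: "nearest" is taken to mean the smallest great-circle distance between county centroids, and the names `haversine_km`, `fill_from_nearest`, `values`, and `centroids` are hypothetical.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def fill_from_nearest(values, centroids):
    """Fill missing entries with the value of the nearest county that has one.

    values:    dict mapping county id -> value, or None when missing
    centroids: dict mapping county id -> (lat, lon) of the county centroid
    Only originally known values are used as donors, so filled entries
    are never propagated further.
    """
    known = {k: v for k, v in values.items() if v is not None}
    filled = dict(values)
    for county, value in values.items():
        if value is None and known:
            nearest = min(
                known,
                key=lambda k: haversine_km(centroids[county], centroids[k]),
            )
            filled[county] = known[nearest]
    return filled
```

for example, with `values = {"a": None, "b": 10.0, "c": 99.0}` and county "b" much closer to "a" than "c" is, the entry for "a" is filled with 10.0. geographic distance is the only criterion here, so a dense urban county can still borrow values from a sparsely populated neighbor.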
however, this could be ineffective if this certain county and its nearest neighbor are very different in nature, as then the data for this certain county would not be particularly representative. while the data holes were not particularly impactful for the models described above, they certainly could be for models that took many non time-series features from those datasets into consideration.
a contribution to the mathematical theory of epidemics
unique epidemiological and clinical features of the emerging 2019 novel coronavirus pneumonia (covid-19) implicate special control measures
positive rt-pcr test results in patients recovered from covid-19
table?q=american%20community%20survey%20%28acs%29%202018&tid=acsdp1y2018.dp05&y=2018&vintage=
key: cord-018752-7jmnwpq6 authors: medina, marie-jo title: pandemic influenza planning for the mental health security of survivors of mass deaths date: 2016-02-12 journal: exploring the security landscape: non-traditional security challenges doi: 10.1007/978-3-319-27914-5_5 sha: doc_id: 18752 cord_uid: 7jmnwpq6
influenza a pandemics have been documented to occur at 10- to 50-year intervals, an average of three events per century, dating back to the 16th century. each recorded pandemic has resulted in an increase in annual mortality rates in the infected population, with mass deaths in one pandemic wave equalling fatalities sustained over six months of an epidemic season. this chapter aims to rectify the oversight in pandemic preparedness plans by presenting a compendium of guidelines and recommendations by international health organisations, pandemic fatality experts, and experienced mass death management professionals. its objective is to have available a mass fatality framework to complement the who pandemic influenza preparedness and response (2009) guideline, on which individual national pandemic preparedness plans are based.
it is written in a format that incorporates who’s emphasis on finding the ethical balance between human rights and successful plan implementation; the assimilation of national pandemic plans with existing national emergency measures; and the ‘whole group’ system of engaging individuals, families, localities, and business establishments in the process. this chapter is also written such that it can be made applicable to analogous infectious disease outbreaks such as sars and ebola, as well as comparable mass fatality events. influenza a pandemics have been documented to occur at 10- to 50-year intervals, an average of three events per century, dating back to the 16th century (kasowski 2011; taubenberger 2006; who 2005). each recorded pandemic has resulted in an increase in annual mortality rates in the infected population, with mass deaths in one pandemic wave equalling fatalities sustained over six months of an epidemic season (hardin 2009). the three pandemics in the 20th century occurred in 1918, 1957, and 1968. the latter two have been estimated to have resulted in increased deaths totalling up to four million among people in at-risk groups worldwide, while the former resulted in the mass deaths of approximately 40 million in the otherwise healthy groups (hardin 2009; kasowski 2011; taubenberger 2006; who 2005). the 1918 pandemic remains the most fatal pandemic in history; a novel influenza subtype of equivalent virulence is anticipated to result in deaths in approximately 2 % of the current global population (ibid). there has so far been one pandemic this 21st century, caused by the h1n1 influenza subtype in 2009. although its attack rate was characterised as mild, it nonetheless resulted in the global deaths of up to 575,400 people who would not have otherwise perished at that time (dawood 2012).
approximately 80 % of the fatalities were in populations younger than those who generally die during influenza epidemics, and the burden was most pronounced in the poorer african and southeast asian countries (ibid). in 1999, who published guidance on pandemic influenza preparedness as a framework for who member-nations, in their attempts to develop a plan against the risk of the occurrence of an influenza pandemic, and to introduce the six phases in the declaration of a pandemic (who 2005). in 2005, improvements to the guidance were incorporated in keeping with the international health regulations (ihr). in 2009, further revisions were made to consolidate developments that have transpired since the enactment of the 2005 framework (who 2009). pertinent to this discourse is the revision accentuating the primacy of ethical principles when finding a balance between human rights and successful pandemic plan implementation. upholding ethical principles includes respecting both the dead and the bereaved throughout the course of the event (morgan 2006, 2009); handling and disposing of bodies in a dignified manner; and respecting cultural and religious conventions (ibid). further, it encompasses the acknowledgement of the diversified vulnerabilities and capabilities of individuals and groups, so that nobody experiences marginalisation and disavowal of support (sphere 2004). vulnerabilities may be physical, such as: gender; age; physical or mental impairment; and hiv/aids status. they may also be social, including: ethnicity; religious affiliation; political leanings; and residency status (ibid). published literature in psychology suggests that disasters can induce mental illnesses among survivors (bonanno 2010; gibbs 2003). the mental illness most often associated with disasters is posttraumatic stress disorder (ptsd).
however, several individual symptoms, as well as syndromes, have also been associated with the trauma, albeit not given a specific name (ibid). some research proposes that the amount of trauma sustained in a disaster is directly proportional to the severity of the psychological illness. others assert, on the other hand, that ancillary factors may also contribute to mental health risks. these may include the specific context in which the survivor identifies with the disaster; the emotional and physical distance an individual has from the situation; and the quality and accessibility of the support available (ibid). further, there are those who argue that ptsd may be overestimated, while other, less characterised, symptoms are underestimated (bonanno 2010). this uncertainty in the literature has been attributed to the difficulty encountered in assessing psychological consequences sustained in disasters, because of the chaotic nature of the event; and because of the methodological impediments to psychoanalysis (ibid). to provide a more cohesive portrait of 'typical' mental health illnesses following a disaster, george bonanno and colleagues (bonanno 2010) compiled data from high quality research and summarised their findings in five categories. the first category relates to the severity of mental illness brought on by disaster. it was determined that, although consequences of trauma from disasters may range from grief and ptsd to depression and suicidal tendencies, more extreme presentations of the disease have only been observed in a small number of cases. in adults, this accounts for only 30 % of all subjects studied. in youths, acute symptoms in the initial aftermath tend to be severe; however, chronic symptoms tend to be more similar to those in adults, not exceeding 30 %. the second category pertains to differences in psychological outcomes and resilience.
it is suggested that some survivors overcome the traumas within two years post-disaster, while the more resilient only experience transient symptoms and recover fairly quickly. the third refers to the factors relating to outcomes, already alluded to above, and theorises that there is no single predictor of outcome. this is because individuals have different risk factors for mental health illness, as well as varied mechanisms for coping with trauma. the penultimate category specifies the risk to interpersonal and community relationships. it acknowledges that, although some affiliations are made stronger by shared traumatic experiences, several indicators suggest that most relationships actually do not survive the experience. incidentally, the status of survivors' post-traumatic interpersonal relationships also influences their coping mechanisms. finally, in examining the mental health effects on populations located at a distance from the disaster scene, it has been determined that transient grief may be experienced by these individuals; however, psychological disorders may only be recognisable in those with prior experience in disasters, including those who lost loved ones under similar circumstances (bonanno 2010). lastly, literature suggests that the emotional and psychological traumas among survivors of multiple deaths are compounded when the bodies of their loved ones are not processed with care; this is true irrespective of the age, race, or nationality of the deceased (gibbs 2003; morgan 2006). poorly managed deaths, therefore, present a perceivable global mental health risk.
however, despite the globally acknowledged increase in deaths due to infection with novel influenza a subtypes, and all that is recognised about risks to mental health security in mass fatalities, pandemic preparedness plans remain disproportionately focused on preventing the manifestation of a pandemic and on mitigating morbidities and mortalities, rather than equally addressing mass fatality management preparedness plans. mass fatality management preparedness planning is paramount in any influenza pandemic preparedness plan if business continuity is to be expediently achieved, and if survivor grief and psychological trauma are to be mitigated through the honourable and respectful handling of the remains of the dead. this chapter aims to rectify the oversight in pandemic preparedness plans by presenting a compendium of guidelines and recommendations by international health organisations; pandemic fatality experts; and experienced mass death management professionals. its objective is to have available a mass fatality framework to complement the 2009 who pandemic influenza preparedness and response guideline, on which individual national pandemic preparedness plans are based. it is written in a format that incorporates who's emphasis on the assimilation of national pandemic plans with existing national emergency measures; the 'whole group' system of engaging individuals, families, localities, and business establishments in the process; and on finding the ethical balance between human rights and successful plan implementation. sources for the guidelines include: 1. hardin and ahrens (2009) (hardin hereafter) authored a chapter specific to influenza pandemic mass fatality management. it delineates the facts from the myths and provides a guideline for mass fatality planning. 2.
the integrated regional information networks (2012) (irin), whose purposes are to promote the understanding of regional affairs; to advocate competent humanitarian response; and to advance knowledge-based media reporting. fatalities seminar summary report' (2011) (homeland hereafter). this report focused on the lessons learned by multiple sectors, based on their experiences with mass fatality response. 4. oliver morgan's 'management of dead bodies after disasters: a field manual for first responders,' (2009) (morgan henceforth) whose aims are to advocate decent and respectful dead body management; and to increase the likelihood of a successful victim identification. 5. the sphere project: humanitarian charter and minimum standards in humanitarian response (sphere hereafter). it developed the 'universal minimum standards' in humanitarian aid, based on the cumulative experiences of disaster teams and agencies. 6. the uk home office 'guidance on dealing with fatalities in emergencies' (home office henceforward). this is a joint publication of the uk home office and cabinet office, on which the london 2012 olympics pandemic plan was based, the most successful olympics yet. this chapter is written such that it can be made applicable to analogous infectious disease outbreaks such as sars and ebola, as well as comparable mass fatality events. mass fatality is defined as an event where the number of the dead exceeds available local capacities for appropriate management of human remains (morgan 2006; ralph 2015). such events may ensue from natural or man-made disasters, or infectious disease pandemics. mass fatality management planning is highly relevant because of the psychological effects improper handling of dead bodies can have on the survivors (ibid); and because initial stages of fatality management will determine the final outcome in the unequivocal identification of dead bodies, and the subsequent return of their remains to the rightful relatives (ibid).
the survivors' utmost desire, in disasters, is to unequivocally ascertain the circumstances of their missing loved ones (morgan 2009). however, this desire may run counter to the disaster teams' priority: mitigating further consequences of the event (ibid). a balance between practicality and empathy would, therefore, need to be established. formulating preparedness plans is made difficult by the necessity of predicting scenarios for which the plans can be rationally devised. undoubtedly, human imagination will fail to predict every possible scenario, and the disaster that eventually unfolds will be one too unbelievable to conceptualise. nonetheless, it is imperative that certain assumptions are made, if only to provide planners with a point of reference. when developing pandemic plans, hardin and ahrens (2009) suggest five assumptions that would be invaluable. they are: 1. the local community would need to be able to support itself, particularly during a pandemic, when similar events are simultaneously occurring elsewhere, and aid will tend to be diffused. 2. funeral homes will be rapidly overwhelmed. 3. resourcefulness will be needed in acquiring inventory essential for body management. 4. funeral and memorial practices may need to be altered to ensure the expeditious processing of bodies. 5. friends and family from near and far will be desperate for information. chaos is the immediate aftermath of a disaster (morgan 2009). therefore, a coordinated plan put into operation as soon as practicable will be invaluable in managing the disaster area. it is likely that local emergency personnel will be first at the scene, and will already have coordinated disaster plans in operation (ibid). however, it is important to note that stakeholders, leadership structure and operational procedures in pandemic planning may differ from these and other mass fatality plans (hardin 2009; morgan 2009).
hence, it is essential that: (a) a comprehensive list of stakeholders is included in the plan. these may include: 1. emergency management teams 2. public health authorities 3. medical and veterinary teams 4. medical examiners and coroners 5. police 6. death registry 7. funeral directors 8. cemetery and crematorium administrators 9. legal professionals 10. religious officials and community support groups 11. schools 12. social well-being advisers 13. mental health professionals (b) establish a structure of leadership, with absolute authority ascribed to the entity presiding over the management of the dead. a flowchart with names, responsibilities and emergency contact numbers will be beneficial. (c) specify each stakeholder's duties and responsibilities. provide timelines and benchmarks for the successful completion of each task. (d) coordinate resources. a system of real-time stock-taking will be beneficial in the sharing and distribution of essential goods and services. stipulate how reimbursement for the use of shared resources will be managed, including realistic timelines for monetary disbursement. (e) coordinate with regional and national fatality management plans. their resources and expertise will be of considerable value, particularly in matters relating to funeral homes, mass communication, logistics, and national and international jurisprudence and aid. (f) coordinate with international aid organisations. they have the experience, expertise and resources to respond on short notice. coordinating resources beforehand (in 1(d) above) should prevent stockpiling of necessities with shortened expiration dates that may later go to waste. it is suggested that funeral directors have stock in circulation that is proportionate to a six-month supply for standard operations, the assumed length of the first pandemic wave. it is necessary to note that (hardin 2009; irin 2012; morgan 2009 ): (a) embalming fluids tend to have a protracted shelf life. 
(b) affordable caskets will be in great demand, particularly in instances when death occurs in more than one family member. (c) cremations will require large amounts of fuel. copious amounts of information are compiled on the dead and missing, regardless of the size of the disaster. appropriate management of all information will require human and technical expertise, which may be beyond the capabilities of local communities. regional authorities are more likely to have trained personnel and modernistic technologies, and may therefore, be best placed to take the lead in information management (homeland 2011; morgan 2009 ). mass media are indispensable in communicating with a wide audience during a disaster, and both amateur and seasoned journalists will be among the first at the scene. however, the content of the information they provide as well as the manner in which they dispense their knowledge of the scene may induce stress and anxiety among the survivors. therefore, it is paramount that members of the press be given every possible opportunity to communicate responsibly and to the best of their abilities (homeland 2011; morgan 2009: 19) . effective information management reduces stress and anxiety among survivors, and augments efforts in successfully recovering remains and identifying the dead. listed below are the matters that need to be considered (homeland 2011; morgan 2009): (a) coordinating information 1. information hubs need a local and regional presence and should be established in the first instance. 2. determine who would need to be informed, and what the best method of communication would be, to ensure that information reaches as much of the appropriate target groups as possible. 3. local centres are best for collecting and providing information on the dead and the missing, and for relaying information on the immediate needs of the grieved. 4. 
impose upon humanitarian and aid agencies since they will have first-hand knowledge of the state of the scene, and the kind of support the survivors will need. 5. all information needs to be centralised and synchronised for accuracy, and for promoting the successful tracking of the dead and missing. (b) the information 1. foremost is the protection of the privacy of those afflicted and their families. 2. take advantage of already established methods of gathering information (e.g. surveillance networks; automatic alert systems). ascertain whether expanding the scope of these systems will be beneficial and can be implemented rapidly. 3. use a template that covers all the essential information, and that could easily be updated. this would include what is being done; what is known; what is yet to be determined; and where further information will be provided when they become available. 4. an informed decision needs to be taken on when it would be appropriate to report the number of dead, missing and displaced. too soon, and the numbers are likely to be grossly inaccurate; too late, and the media could be disposed towards exaggeration. 5. information on the system of search and rescue, and body retrieval, identification, interment and disposal must be provided. 6. photographs and other identifying information should only be released to the media if it has been determined that doing so would enhance the identification process. (c) the media 1. designate a representative with whom the media may liaise. 2. install an office specific for media relations, preferably as close to the scene as possible. 3. provide journalists with accurate, confirmable, and up-to-date information as close to real time as practicable, to advance factual reporting and mitigate rumour-mongering. this may be facilitated through regular press briefings or short interviews. 4. social media is a double-edged sword. 
knowledge will be available immediately and in real-time; however, the material will tend to be unedited and prone to bias. if not managed appropriately, it may disrupt fatality plans already in progress. (d) the public 1. determine the most appropriate method of providing information to different age groups and social, cultural and economic strata, to avoid marginalisation. 2. circulate concise information on what procedures need to be adhered to, immediately following a disaster. 3. vigilance in social media trends is essential. (e) the survivors 1. impress upon survivors that help is available. enumerate what support can and cannot be provided, and where they need to go to receive the specific aid they need. 2. provide an emergency contact number strictly for the relatives of the missing and the dead. 3. provide specific information on where relatives need to go and what documents they would need to bring, to facilitate the efficient and expeditious management of enquiries. 4. specify the process for procuring a death certificate, so that they may be able to make legal and funeral arrangements. (f) the humanitarians 1. ensure that humanitarian and aid agencies are provided with accurate information, particularly in regard to the risks from dead bodies, and that they themselves are sharing accurate information to those at the scene. 2. relief agencies such as the international committee of the red cross may be able to help trace missing persons, if given sufficient information. (g) the dead bodies 1. standard pro forma containing basic information should be completed for all bodies. 2. in the absence of an electronic system of data-gathering, hand-written forms may be used. however, extreme care would be needed in writing and in the subsequent transfer onto an electronic format. 3. all manner of original forms must be readily available, should data confirmation be necessary. 4. 
all items of a personal nature, including photographs, may be included in the database. 5. all information must be accompanied by a chain-of-custody. in the early stages of a pandemic, scientific intelligence gathered through already established surveillance systems would need to be rapidly apprised of the nature of the virus and the manner of death, through the investigation of the index case. it is recommended that the role of investigator be entrusted to the jurisdictional medical examiner or coroner (me/c) in two capacities (hardin 2009 ): 1. limited jurisdiction over the dead body in cases when: (i) death fits the profile for an emerging disease that needs laboratory confirmation from body fluids and tissues. (ii) death of a poultry worker from influenza-like illness (ili). (iii) death from ili of family members or contacts of poultry workers. (iv) death due to recent travel to a country where pandemic flu strain is circulating. (v) first death case in a hospital, requiring tissue samples for virus characterisation. (i) there is no listed attending physician. (ii) the deceased is unknown and decedents have not been found. (iii) sudden deaths and fatalities uncharacteristic of those due to a flu virus. (iv) death of incarcerated persons. (v) it is essential to public health. (b) search for the missing death from pandemic influenza generally occurs at home or in group care facilities. in the event that an exceedingly virulent pandemic strain also kills its victims with haste, more will be unable to seek hospital admissions prior to death (hardin 2009 (ii) numbering and photographing the dead (or body parts for non-intact bodies). (iii) a mechanism for immediate confirmation of death by me/c. existing laws may need to broaden the stipulations on who has legal powers to pronounce death. (iv) record the date, time and place of death, as well as the testifier's name and contact information, and their affiliated organisation's name and address. 2. 
in the community (hardin 2009) (i) designate a phone number for the missing persons' hub where inquiries can be made about the well-being of certain individuals. this hub must be interfaced with hospital and healthcare centre systems of admissions and discharges, and with me/c and death registry logs. (ii) there must be a system for the regular advertisement of the hub number in several mass media formats. (iii) it is essential that the hub's database be unrestrictedly shared with the police and emergency missing persons' divisions. (c) recovery and transport of bodies dead body management begins when the remains of the deceased are being recovered (morgan 2009). recovery commences immediately after searching of the scene has been completed (ralph 2015). it could last for days or weeks, but may be protracted in more severe disasters (morgan 2009). its priority is the rapid location and retrieval of bodies or body parts, and the deceased's personal effects. speed in recovery aids in identifying the dead; reducing the psychological impact on survivors; and diminishing the distress often associated with the image and odour of death (irin 2012; morgan 2009). the recovery scene is often chaotic and uncoordinated because there is an abundance of groups and individuals trying to help, including locals; aid agencies; and military and civilian search and rescue operatives (morgan 2009). in order that body recovery does not impede the simultaneous assistance offered to survivors, the following should be considered (hardin 2009 (i) use of photographic equipment and standard documentation materials such as body tags with unique references. documenting the exact place and date of recovery would augment the identification process. (ii) impermeable body bags are ideal for recovery, and double-bagging is preferable; however, sheets of any material may be used if nothing else is at hand.
each body part must be collected in separate bags and no attempts must be made to match them at the scene. (iii) personal items ought not to be separated from the owner, and all documentation must remain with the body. (iv) establish two teams: one to take bodies to a holding area prior to delivery; the other to deliver them for either immediate identification or temporary storage for subsequent identification. (v) the holding area will have rapid turn-over. hence, it is best situated within close proximity of the scene; preferably stretched across the inner scene cordon. the holding area is a private and secured space where documents can be cross-checked and evaluated for completeness. at no point must this area be used as a mortuary; a facility for victim identification; or as a temporary storage facility. (vi) transport can be achieved by using the body bags or sheets with which they are covered, or by trucks and trailers; however under no circumstances must ambulances be used, as the living are best served by them. 4. disaster areas may be hazardous. it is paramount that recovery teams not be exposed to undue risks in performing already stress-filled tasks. risk assessments are requisite and basic health and safety measures must be in place (home office 2006; morgan 2009). (i) ventilate enclosed spaces before attempting recovery. (ii) at the minimum, protective clothing would include disposable biohazard suits; sturdy boots and durable gloves. face masks may be provided, if only to alleviate anxiety from odours and from fear of aerosol infections. (iii) personnel need appropriate training in donning, doffing and decontaminating protective equipment. (iv) a mechanism of hand-washing, disinfection and decontamination should be available. (v) first aid and emergency treatments will be needed on-site. (vi) the need for vaccination and prophylaxis would have to be evaluated. 
mass fatalities are expected to overwhelm local surge capacities which will invariably result in delays in victim identification. further identification delays can result from the logistics of assembling a forensics team, which can take weeks; and from natural decomposition. places in hot climates are especially vulnerable to decomposition, resulting in bodies becoming unrecognisable within 12-48 h. to maximise every opportunity of successfully identifying bodies, temporary storage facilities are compulsory. these can be in the form of cold storage or transitory interment (hardin 2009; irin 2012; morgan 2009; sphere 2004) . it is imperative that bodies or body parts are stored in the bags or sheets in which they were recovered and that their associated unique identifying tags are written on water-impermeable labels, rather than on the bodies or bags themselves (ibid). (iv) shortage of refrigerated storage at the scene is to be expected. establish a back-up plan until more coolers become available. (v) dry ice may be used in the interim 1. overlaying dead bodies with dry ice creates forensic artefacts, and should therefore, be avoided. instead around small groups of bodies, construct a wall of dry ice approximately 0.5 m in height, and secured with durable plastic sheeting. 2. ventilate areas where dry ice is in use. (vi) the use of ice is impractical and problematic. 1. a large inventory is required, particularly in instances when rapid melting occurs. 2. melted run-offs may pose concerns about diarrheal infections. 3. appropriate disposal of ice water will complicate management plans. 4. water may distort bodies and destroy personal properties. 2. interment is the burial of bodies underground when there are no other alternatives, and when temporary storage is needed for longer periods. (i) efficient disinterment will be aided by proper grave construction. 1. use a familiar and protected plot of land. 2. bury bodies individually if at all possible. 
otherwise, use trenches. 3. local practices may dictate how bodies are positioned (e.g.: facing mecca). 4. burials should only have one level; be at least 1.5 m in depth; and have parallel spaces 0.4 m in between bodies. 5. bottoms of graves with less than 5 occupants should be at least 1.2 m away from ground water. this space should be increased to at least 1.5 m if buried in sand, and at least 2 m if many more bodies are interred. 6. tag each body, and record their positions above the grave. use of gps systems will be invaluable. (ii) selecting burial sites 1. assess soil characteristics, height of water table, and available tracts of land. 2. situate in land acceptable to local communities. 3. establish in areas easily accessible to mourners. 4. sites should be at a distance of at least 10 m from developed land, and 200 m from sources of water, depending on local topographical conditions. (iii) unceremonious burial in mass graves does not satisfy any public health interests; is socially unacceptable; and may waste inventory. (iv) avoid rushed and unmannerly cremations. (v) it is disrespectful to gather the dead using backhoes, diggers, or bulldozers. (vi) sphere international standards mandate that: 1. bodies are disposed of with dignity 2. cultural and religious practices be honoured 3. public health practices be upheld. (vii) where burial is inconceivable due to frozen tracts of land or lack of solid ground, it may be necessary to store bodies for the duration of a pandemic wave. (viii) survivors are more likely to spread infectious diseases than dead bodies, except in cases where diarrheal diseases and haemorrhagic fevers are indicated. 1. tuberculosis, hepatitis b and c, and diarrhoeal diseases can survive for up to 2 days in dead bodies. 2. hiv may survive for up to 6 days. 
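the numeric limits given above for grave construction and site selection lend themselves to a simple checklist. the following is a minimal sketch, not part of any cited guidance: the `GravePlan` record, its field names, and the function names are hypothetical, the 0.4 m spacing is treated as a minimum, and "many more bodies" is assumed to mean five or more occupants, which the text does not state.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GravePlan:
    """Hypothetical record of a planned temporary burial (distances in metres)."""
    depth_m: float
    spacing_between_bodies_m: float
    occupants: int
    sandy_soil: bool
    clearance_above_groundwater_m: float
    distance_to_developed_land_m: float
    distance_to_water_source_m: float


def required_groundwater_clearance_m(occupants: int, sandy_soil: bool) -> float:
    """Minimum gap between grave bottom and ground water.

    Assumption: five or more occupants counts as "many more bodies" (2 m).
    """
    if occupants >= 5:
        return 2.0
    return 1.5 if sandy_soil else 1.2


def check_grave_plan(plan: GravePlan) -> List[str]:
    """Return a list of problems; an empty list means every limit is met."""
    problems = []
    if plan.depth_m < 1.5:
        problems.append("grave depth below 1.5 m")
    if plan.spacing_between_bodies_m < 0.4:
        problems.append("less than 0.4 m between bodies")
    needed = required_groundwater_clearance_m(plan.occupants, plan.sandy_soil)
    if plan.clearance_above_groundwater_m < needed:
        problems.append(f"grave bottom less than {needed} m above ground water")
    if plan.distance_to_developed_land_m < 10:
        problems.append("site closer than 10 m to developed land")
    if plan.distance_to_water_source_m < 200:
        problems.append("site closer than 200 m to a water source")
    return problems
```

because the text notes that the distances depend on local topographical conditions, a planner would treat these thresholds as defaults to be adjusted rather than fixed rules.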
establishing the identity of the deceased is the second major function of incident response teams, following search and recovery, and is generally the responsibility of the me/c (ralph 2015) . identification is accomplished by making a match between the information collected about the deceased, and the information documented on the missing and presumed dead (morgan 2009 ). the sooner a positive id is accomplished, the better for the relatives waiting to bury their dead and to go through the legal procedures (ibid). visual identification through decedent recognition or photography is the most basic method of identification (home office 2006; morgan 2009; ralph 2015) . however, mistaken identity is common with this practice, particularly when the dead is soiled or already decomposed (ibid). further, viewing multiple dead bodies may have psychological effects on the witness, thereby diminishing the legitimacy of the identification. errors in identification cause embarrassment to all involved; distress to the relatives; and difficulties with legal issues (morgan 2009 ). therefore, forensic methods would also need to be employed. the success of forensic identification is enhanced by the initiative and diligence of the death management team (ibid). identification is carried out in the morgue, where the cause and manner of death are also determined. the me/c determines where the incident morgue is eventually established (hardin 2009; home office 2006; ralph 2015) ; it may be that a temporary facility is constructed, or that an already existing structure is expanded to accommodate the surge (ibid). the benefits and drawbacks of each type of facility would need to be judiciously considered (ibid). 1. things to consider: (i) determine how soon temporary mortuaries can be commissioned for use, compared to how quickly expanded space in already built mortuaries can be made available in disasters. 
(ii) commissioning time will have a direct impact on body recovery, storage, and transport. (iii) temporary facilities need to be operational as soon as 24 h post-disaster. (iv) the use of previously functional morgues may mean that storage already contain bodies; hence, surge capacity will be unknown until such time as the disaster occurs. (v) the disaster scene will be instrumental in determining the necessity of constructing temporary facilities. information on the projected number of afflicted; the disposition of the dead; and the estimated 4. autopsies are not needed to confirm death caused by influenza. however, if they are performed, some guidelines apply: (i) in the interests of public health, respiratory tract and tissue samples for laboratory analyses may be collected. (ii) liaising with public health laboratories on the current guidelines for collecting and transporting influenza specimens will save time and effort. (iii) next-of-kin will generally need to give permission for the autopsy to be performed in a hospital. (iv) me/cs do not need permission if the autopsy is within their remit. (i) release dead bodies only when a definitive identification has been made. (ii) expedited release may be necessary where cultural or religious customs are indicated. (iii) some laws stipulate who has the authority to perform this task. (iv) the name and contact details of the claimants need to be collected and filed along with other documents associated with the body. (v) unidentified bodies, foreign nationals, undocumented migrants, and homeless persons need to be stored or interred for further identification at a later time. (vi) release of bodies with missing parts may later impede the management of severed body parts. to minimise complications, family members' wishes regarding future identification of other body parts should be documented. choices may include: 1. to postpone body release until all body parts have been found. 2. 
to proceed with the funeral but be apprised of other parts that are later found. 3. to proceed with the funeral and consider the matter closed. (vii) a death certificate is provided with the release of the body. 6. death certificates (i) the death certificate is a legal document; hence, the law stipulates the signatory on the certificate. (ii) the document specifies the cause and manner of death; where death occurred; when it was pronounced; and the name and contact details of the signatory. (iii) in pandemics, it is essential that hospitals and care facilities assign this task to specific individuals in order to mitigate chaos. (iv) funeral directors with policies against collecting bodies unaccompanied by a certificate of death need to allow for flexibility during pandemics. 1. this should be addressed in the planning stages. 2. all stakeholders must be in agreement. (b) funeral homes and crematory operations funeral directors are responsible for the recovery and transport of dead bodies; preservation of the integrity of the chain-of-custody; and assistance in disposal of the remains. although they are not qualified grief counsellors, they are nonetheless tasked with conversing with individuals on the most discomfiting day of their lives. this also makes them the best people to facilitate the return of the dead to their bereaved relatives (homeland 2011). once a body has been released to the relatives, it is generally their responsibility to contact the funeral director of their choosing, for the transport of bodies to funeral homes and the subsequent burial or cremation, according to their cultural or religious beliefs (hardin 2009; homeland 2011; irin 2012; morgan 2009; sphere 2004). pandemics could result in funeral homes overseeing 6 months' worth of dead bodies within a 6-8 week period (hardin 2009; homeland 2011; irin 2012; morgan 2009).
therefore, it may be prudent for individual funeral homes to plan for employing more trained personnel who can be available on short notice (ibid). (i) funeral directors will be responding to requests from families to transport bodies to funeral homes, and from me/cs to provide conveyance to mortuaries or storage facilities. plans for the inclusion of more licensed funeral directors and transport services are therefore essential. (ii) safeguard lawful body transport by ensuring that funeral home personnel are licensed and trained in recovery and transport, and that their vehicles are approved and registered for carrying dead bodies. (i) burials are more practicable in disasters, because they enable future identification of persons yet unknown. (ii) it is not good practice to cremate the remains of unidentified bodies. 1. there is no public health benefit in cremating those who die of influenza. 2. cremation will not allow identification in future. (iii) cremating one body takes 4 h and produces 3 to 6 pounds of ash and partially incinerated body parts, thereby creating logistical difficulties when the number of bodies mounts rapidly. (iii) ensure the area is secure and private. (iv) it needs to be accessible 24 h a day for the first 3 days, after which it can be scaled down to operate for 14-16 h a day. (v) anticipate approximately 10 kinspersons for every victim and plan accordingly. (vi) multiple facs may be necessary, but movement of families from one area to another must be avoided; instead, fac personnel should go to where the survivors are situated. (vii) facilities must be scalable. 3. support and assistance (i) prioritising the needs of the vulnerable. (ii) personal and private meetings with family members as soon as practicable to initiate the collection of ante mortem information for the mortuary. (iii) system for reporting and providing information on the missing.
(iv) emotional and psycho-social support for survivors befitting their needs, culture, and the context of the disaster. (v) systematic, up-to-the-minute information on the missing and the dead. families ought to be the first informed of the status of their loved ones. (i) each support agency within facs needs a command post; a separate area for staff preparation and duty operation; and the capability to deploy staff to fac. (ii) the nature of the disaster will determine which agencies are involved. frequently in force are family assistance services; mental health assistance; and child agencies. (iii) aid agencies and faith groups may be present. (iv) fac staff must be vetted. (v) flexibility is essential in order to accommodate the changing needs of the families as time progresses. based on the history of influenza a pandemics, this century may be due for, at most, two more pandemics. if even one of them is as deadly as that of 1918, then approximately 2 % of the global population will die. however, even if the future 21st century pandemics are atypically mild as that of 2009, still many more people will die than normally would. the who provided a framework for influenza pandemic preparedness planning. however, its focus is skewed towards the prevention of the event from happening, and a bit remiss on planning for the management of the surge in deaths. having a fatality management plan incorporated in pandemic plans is relevant because mishandling of dead bodies is a mental health risk for their loved ones. mass fatalities may ensue from natural or man-made disasters, or infectious disease pandemics. regardless of how they may transpire, conflict will invariably come to pass between the fatality management team, and the surviving relatives of the missing and the dead. 
conflict is inevitable, because each group contextualises the event from different perspectives; fatality management personnel perceive the event as something that needs immediate oversight, in order that they may mitigate further calamitous consequences; survivors, on the other hand, are more single-minded in their overwhelming desire to determine the circumstances of their missing loved ones (morgan 2009). however fatality management ultimately eventuates, respect; sympathy; and caring are due the dead and their relatives throughout the event (ibid).

weighing the costs of disaster: consequences, risk, and resilience in individuals, families, and communities
estimated global mortality associated with the first 12 months of 2009 pandemic influenza a h1n1 virus circulation: a modelling study
disasters, a psychological perspective
(ed) pandemic influenza emergency planning and community preparedness
managing mass fatalities seminar: a summary report
guidance on dealing with fatalities in emergencies
influenza pandemic epidemiologic and virologic diversity: reminding ourselves of the possibilities
mass fatality management following the south asian tsunami disaster: case studies in thailand
management of dead bodies after disasters: a field manual for first responders
'mass fatality management', disaster resource guide
'analysis: why dead body management matters', irin humanitarian news and analysis
influenza: the mother of all pandemics
who global influenza preparedness plan: the role of who and recommendations for national measures before and during pandemics
pandemic influenza preparedness and response: a who guidance document

time of recovering their remains, all need to be considered. (vi) in the event that a pandemic is caused by a cbrn attack, the mortuary will be fundamental in criminal investigations; hence, standard operating procedures must be such that substantiation does not fail under legal scrutiny.
family assistance is one of the most sensitive undertakings in mass fatality management. family assistance centers (facs) are generally established near mass fatality scenes, where survivors can congregate to wait to hear about the status of their missing, and to receive much-needed support (homeland 2011; morgan 2009; ralph 2015). facs are secure, private, and multi-sectorial, so that all the support and assistance needed can be provided in one facility (ibid). things to be considered in establishing facs include: 1. function of fac (i) to provide families with information on their missing and dead. (ii) to provide shelter from media intrusion and from the newsmongers. (iii) to enable investigators and me/cs to gather information from families about the missing and the deceased. (i) situate facs near the disaster scene, where ingress and egress can easily flow. (ii) avoid locating facs near the morgue.
key: cord-028337-md9om47x authors: ketcham, scott w.; sedhai, yub raj; miller, h. catherine; bolig, thomas c.; ludwig, amy; co, ivan; claar, dru; mcsparron, jakob i.; prescott, hallie c.; sjoding, michael w. title: causes and characteristics of death in patients with acute hypoxemic respiratory failure and acute respiratory distress syndrome: a retrospective cohort study date: 2020-07-03 journal: crit care doi: 10.1186/s13054-020-03108-w sha: doc_id: 28337 cord_uid: md9om47x background: acute hypoxemic respiratory failure (ahrf) and acute respiratory distress syndrome (ards) are associated with high in-hospital mortality. however, in cohorts of ards patients from the 1990s, patients more commonly died from sepsis or multi-organ failure rather than refractory hypoxemia. given increased attention to lung-protective ventilation and sepsis treatment in the past 25 years, we hypothesized that causes of death may be different among contemporary cohorts. these differences may provide clinicians with insight into targets for future therapeutic interventions.
methods: we identified adult patients hospitalized at a single tertiary care center (2016–2017) with ahrf, defined as pao(2)/fio(2) ≤ 300 while receiving invasive mechanical ventilation for > 12 h, who died during hospitalization. ards was adjudicated by multiple physicians using the berlin definition. separate abstractors blinded to ards status collected data on organ dysfunction and withdrawal of life support using a standardized tool. the primary cause of death was defined as the organ system that most directly contributed to death or withdrawal of life support. results: we identified 385 decedents with ahrf, of whom 127 (33%) had ards. the most common primary causes of death were sepsis (26%), pulmonary dysfunction (22%), and neurologic dysfunction (19%). multi-organ failure was present in 70% at time of death, most commonly due to sepsis (50% of all patients), and 70% were on significant respiratory support at the time of death. only 2% of patients had insupportable oxygenation or ventilation. eighty-five percent died following withdrawal of life support. patients with ards more often had pulmonary dysfunction as the primary cause of death (28% vs 19%; p = 0.04) and were also more likely to die while requiring significant respiratory support (82% vs 64%; p < 0.01). conclusions: in this contemporary cohort of patients with ahrf, the most common primary causes of death were sepsis and pulmonary dysfunction, but few patients had insupportable oxygenation or ventilation. the vast majority of deaths occurred after withdrawal of life support. ards patients were more likely to have pulmonary dysfunction as the primary cause of death and die while requiring significant respiratory support compared to patients without ards.
keywords: acute respiratory distress syndrome, acute hypoxemic respiratory failure, mortality, cause of death background acute hypoxemic respiratory failure (ahrf) is among the most common causes of critical illness, with a hospital mortality of approximately 30% [1]. in patients meeting the definition of acute respiratory distress syndrome (ards), mortality is approximately 40% [2]. however, while ahrf and ards are each defined by severe hypoxemia and associated with high mortality, death due to refractory hypoxemia is reportedly rare. in cohorts of ards patients treated in the 1990s, only 13-19% of deaths were due to refractory hypoxemia, while deaths due to multi-organ failure from sepsis were the cause of up to 50% of deaths [3]. these findings suggested that therapies focused on reducing the complications of sepsis would have a greater impact at improving ards survival than therapies for severe hypoxia. since the 1990s, however, cause of death specifically related to organ system dysfunction has not been described despite substantial evolution in critical care practices. ventilator management now focuses on minimizing ventilator-induced lung injury, as opposed to normalizing oxygenation and ventilation [4], which may have led to further reduction in death due to refractory hypoxemia. in addition, there has been growing attention to minimization of sedation, early mobilization, and sepsis recognition and treatment, the latter of which may mitigate mortality due to sepsis [5] [6] [7] [8]. finally, there has been an increased focus on palliative care in the intensive care unit (icu), which may lead to earlier treatment limitations [9] [10] [11].
because of these changes in practice and how they may affect causes of death in the icu, we hypothesized that causes of death among ahrf and ards patients may be different from historical cohorts. an updated understanding of the causes of death in these populations would help identify the most important targets for new therapies and help direct future investigation to improve survival. we sought to determine the causes and circumstances of death in a contemporary cohort of ahrf patients, and assess whether causes of death differed among patients with and without ards. we performed a retrospective cohort study of adult patients (aged ≥ 18 years) hospitalized at michigan medicine (january 1, 2016, to december 30, 2017) with ahrf who experienced in-hospital death. patients were identified via an electronic query tool of the electronic health record. as in prior studies [12, 13] , patients were defined as having ahrf when the following criteria were met: (1) receipt of invasive mechanical ventilation for at least 12 h (to exclude routine post-operative ventilation) in the medical, surgical, cardiac, trauma, or neurologic icu, and (2) a pao 2 /fio 2 ratio ≤ 300. lowtidal volume ventilation and protocols for daily awakening and spontaneous breathing trials for mechanically ventilated patients were employed [14] . demographics, comorbidities, highest sequential organ failure assessment (sofa) score within the first 24 h of ahrf onset, the lowest glasgow coma scale during the 72 h prior to death, and icu setting were also collected from the electronic health record through use of the electronic query tool. patients were classified as having ards by multiple physician adjudication as part of a prior study [12] . 
specifically, two critical-care trained physicians reviewed each ahrf hospitalization to determine whether patients met berlin criteria [15, 16] for ards: (1) new or worsening respiratory symptoms began within 1 week of a known clinical insult, (2) pao 2 /fio 2 ≤ 300 while receiving a positive end-expiratory pressure ≥ 5 cm h 2 o, (3) bilateral opacities on chest x-ray, (4) unlikely to be cardiogenic pulmonary edema, and (5) no other explanation for these findings. disagreement between physicians was resolved by a third physician in 21% of patients [12] . in addition to ards status, specific ahrf or ards risk factors were collected as part of the prior study (pneumonia, aspiration, non-pulmonary sepsis, non-cardiogenic shock, major trauma, major surgery, transfusion, pancreatitis, major burn, inhalation injury, vasculitis, pulmonary contusion, drowning, or none) [12] . patients transferred from another hospital were excluded as we were unable to reliably determine ards status, ahrf risk factors, or illness severity on presentation. patient data were reviewed by one of 5 internal medicine-trained physicians who did not participate in the adjudication of ards and were blinded to adjudicated ards status. data regarding causes and circumstances of death were collected using a structured abstraction form (appendix 1, online supplement). specifically, we abstracted presence and severity of sepsis, presence and severity of organ system dysfunction, withdrawal of life-sustaining treatments, and cause of death, as described further below. all data required for abstractions were available in the electronic medical record. to ensure consistency across reviewers, excellent inter-rater reliability was demonstrated on an initial test set of 10 charts (appendix 2, online supplement). for each patient, we assessed for sepsis and dysfunction of 8 organ systems during the 72 h prior to death. 
we classified sepsis and each organ dysfunction as severe or irreversible using definitions from a prior study by stapleton et al. [3] , with the following changes (table 1) . we changed the sepsis definition to align with sepsis-3 (appendix 3, online supplement). in addition, we changed the definition of severe pulmonary dysfunction from specific diagnoses (ards, bilobar pneumonia, bronchopleural fistula, or pulmonary embolism) to receipt of significant respiratory support (high-flow oxygen, invasive mechanical ventilation, or non-invasive positive-pressure ventilation) to better capture patients with severe pulmonary dysfunction. if a patient underwent withdrawal of life support before meeting any of the objective organ dysfunction criteria outlined in table 1 , abstractors were instructed to assign irreversible dysfunction to the organ system primarily responsible for the decision to withdraw life support in order to accurately capture cause of death (appendix 4, online supplement). finally, as in stapleton et al., we defined multi-organ failure as organ dysfunction in at least two organ systems [3] . for each patient, we assessed (1) the primary organ system responsible for death, (2) whether death was related to progression of an initial ahrf risk factor or a complication after ahrf, and (3) whether withdrawal of life support occurred prior to death. the primary organ system responsible for death was defined as the organ dysfunction ( table 1 ) that most directly resulted in the patient's death or the decision to withdraw life support (appendix 5, online supplement). for patients with a primary cause of death other than pulmonary dysfunction, cause of death was further classified as being due to progression of an ahrf risk factor (e.g., sepsis, aspiration) or a complication that arose after ahrf onset (appendix 4, online supplement). 
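the multi-organ failure rule described above can be written compactly. this is an illustrative sketch only and not the authors' abstraction tool: the level labels, the `ORGAN_SYSTEMS` tuple, and both function names are hypothetical; the eight systems follow the row headings of table 1; and it assumes sepsis is recorded separately rather than counted as an organ system.

```python
# Eight organ systems assessed in the 72 h before death (names follow table 1).
ORGAN_SYSTEMS = (
    "pulmonary", "cardiac", "neurologic", "hematologic",
    "hemorrhage", "hepatic", "gastrointestinal", "renal",
)

# Hypothetical labels for the abstracted level of dysfunction.
DYSFUNCTION_LEVELS = ("severe", "irreversible")


def dysfunctional_systems(levels):
    """Return the organ systems recorded as severe or irreversible.

    `levels` maps an organ system name to "none", "severe", or
    "irreversible"; systems missing from the mapping count as "none".
    """
    return [s for s in ORGAN_SYSTEMS if levels.get(s, "none") in DYSFUNCTION_LEVELS]


def has_multi_organ_failure(levels):
    """Multi-organ failure: dysfunction in at least two organ systems."""
    return len(dysfunctional_systems(levels)) >= 2
```

for example, a record with severe pulmonary and irreversible neurologic dysfunction would be flagged, whereas a record with pulmonary dysfunction alone would not.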
withdrawal of life support was determined from clinical documentation of intent to withdraw life support and/or not escalate life support in the event of clinical decompensation and subsequent removal or non-escalation of life-sustaining interventions. we present data as numbers (proportions) or medians (inter-quartile range). we compared characteristics of ards vs non-ards patients using chi-square and kruskal-wallis tests and considered p < 0.05 to be significant. data analysis was completed in r. the study was deemed exempt by the institutional review board since all patients were deceased. we identified 385 adult patients with ahrf who died during a hospitalization in 2016-2017, of whom 127 (33%) had ards. the cohort had a median age of 63 years (55-73), was 43% female and 82% white, and had a median sofa score of 12 (10-14) at ahrf onset. most patients were admitted to a medical icu (59%). patients had a median of 2 (1-3) risk factors for ahrf, most commonly non-cardiogenic shock (59% of patients), transfusion (41%), sepsis (39%), and pneumonia (37%, table 2). patients with ards had a higher median sofa score within the first 24 h of ahrf onset (14 vs 12; p = 0.002) and had higher prevalence of pneumonia (52% vs 30%; p < 0.001), aspiration (22% vs 12%; p = 0.01), and non-cardiogenic shock (78% vs 50%; p < 0.001) compared to patients who did not meet the berlin definition of ards (table 2). among the 385 patients, there were 1154 occurrences of organ system dysfunction in the 72 h prior to death (etable 1, online supplement). there were 101 (26.2%) patients that had multiple organ systems with irreversible dysfunction. the most common organ system dysfunctions were pulmonary (70%), neurologic (39%), and cardiac (29%). sepsis was present in 273 (71%) patients and 214 patients (56%) had multi-organ failure prior to death.
however, irreversible pulmonary dysfunction was only present in 19 (5%) patients (table 3): 7 (2% of all patients) with insupportable oxygenation or ventilation, and 12 patients with withdrawal of life support because of a poor pulmonary prognosis. patients with ards had higher rates of sepsis (84% vs 64%; p < 0.001), pulmonary dysfunction (82% vs 64%; p < 0.001), irreversible pulmonary dysfunction (9% vs 3%; p = 0.004), and hematologic dysfunction (41% vs 26%; p = 0.003) compared to patients without ards. overall, the most common primary causes of death were sepsis (26%), pulmonary dysfunction (22%), and neurologic dysfunction (19%, fig. 1). among the 302 patients whose primary cause of death was not pulmonary dysfunction, 212 (55% of all patients) died primarily due to progression of an ahrf risk factor and 90 (23%) died primarily due to complications that arose after the onset of ahrf (table 4). cause of death by icu setting can be found in etable 2 in the supplementary appendix, with some variation in causes of death noted. ards patients were more likely to have a primary cause of death due to pulmonary dysfunction (28% vs 19%; p = 0.04) compared to patients without ards and less likely to have a primary cause of death from cardiac dysfunction (10% vs 19%; p = 0.03, table 4). in addition, ards patients were also more likely to die while receiving substantial respiratory support (82% vs 64%; p < 0.001). the majority of patients (85%) died after withdrawal of life support. the proportion of deaths that occurred after withdrawal of life support did not differ between patients with and without ards (87% vs 84%; p = 0.58, table 4). in this contemporary cohort of 385 adult patients with ahrf, the most common primary causes of death were sepsis, pulmonary dysfunction, and neurologic dysfunction. the majority of patients had multi-organ failure prior to death, most commonly due to sepsis.
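comparisons of proportions such as those reported above (ards vs non-ards) rest on a chi-square test of a 2 x 2 table. the sketch below shows that calculation in python using only the standard library; it is not the authors' r code, the function name and the example counts are hypothetical, and whether a continuity correction was applied in the study is not stated, so it is left as an optional flag.

```python
import math


def chi_square_2x2(a, b, c, d, yates=False):
    """Pearson chi-square test for the 2 x 2 table [[a, b], [c, d]].

    Rows are the two groups, columns are outcome present / absent.
    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    diff = abs(a * d - b * c)
    if yates:
        # Optional continuity correction for 2 x 2 tables.
        diff = max(0.0, diff - n / 2.0)
    statistic = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Made-up counts for illustration only (not taken from the study).
stat, p = chi_square_2x2(10, 20, 20, 10)
print(round(stat, 3), round(p, 4))
```

the per-group counts behind the percentages are not given in the text, so the example deliberately uses invented numbers rather than attempting to reproduce a reported p value.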
more than half of patients were receiving substantial respiratory support at the time of death and the vast majority of patients died after withdrawal of life support. sepsis and pulmonary dysfunction were the top two primary causes of death among both patients with and without ards. our study is consistent with prior reports indicating that sepsis is the leading cause of death among patients with respiratory failure. stapleton et al. found that sepsis was the most common cause of death in ards patients treated in the 1990s [3]. despite increased attention to earlier identification and treatment of sepsis in the intervening decades [17, 18], our study found that sepsis remained the most common cause of death in ahrf patients. this is consistent with recent studies showing that sepsis is the leading contributor to death among patients hospitalized for any cause [19]. sepsis was slightly more common in patients with ards than those without ards, which may reflect the higher rates of pneumonia and sepsis as risk factors for ards. however, it may also suggest that ards patients are at a heightened risk for secondary infections compared to patients without ards. these findings suggest that therapies targeting sepsis-induced multi-organ dysfunction may have the greatest impact on survival among ahrf patients. we found only small differences in the causes and circumstances of death among ahrf patients with and without ards. patients with ards were more likely to have pulmonary dysfunction as the primary cause of death and more likely to die while receiving substantial pulmonary support than patients without ards. this indicates that the berlin ards definition identifies a subset of patients with ahrf who are more likely to die directly from respiratory failure and would benefit from therapies to enhance resolution of respiratory failure. however, the difference in rates of pulmonary dysfunction as the primary cause of death was relatively small among patients with and without ards. our study confirms the findings in prior studies indicating that insupportable oxygenation and/or ventilation is rare among patients with respiratory failure.
table: definitions of severe and irreversible organ system dysfunction
pulmonary°
severe: inability to liberate from mechanical ventilation, non-invasive ventilation, or heated high flow nasal cannula due to inadequate oxygenation or ventilation without aforementioned support
irreversible: insupportable oxygenation or ventilation defined as pao2 < 40 mmhg on fio2 of 1.0 for > 2 h or respiratory acidosis with ph < 7.1 on maximum ventilator settings∞. option was given to apply irreversible dysfunction if care was withdrawn due to poor prognosis related to pulmonary organ system dysfunction.
cardiac
severe: either cardiac output < 2.0 l/min/m2 or documented cardiogenic shock or reversible ventricular fibrillation or asystole
irreversible: cardiogenic shock or arrhythmia not responsive to treatment. option was given to apply irreversible dysfunction if care was withdrawn due to poor prognosis related to cardiac organ system dysfunction.
neurologic
severe: glasgow coma scale < 8 for ≥ 3 days
irreversible: meets brain death criteria. option was given to apply irreversible dysfunction if care was withdrawn due to poor prognosis related to neurologic organ system dysfunction.
hematologic
severe: microvascular bleeding with either fibrinogen < 100 mg/dl, prothrombin time and partial thromboplastin time > 1.5 times control, or platelets < 60,000/μl
irreversible: ongoing microvascular bleeding not surgically correctable with map < 65 mmhg not reversible with blood products. option was given to apply irreversible dysfunction if care was withdrawn due to poor prognosis related to hematologic organ system dysfunction.
hemorrhage
severe: map < 65 mmhg for > 2 h (or requiring vasopressors) necessitating blood transfusions and excluding other causes of hypotension
irreversible: uncontrollable "surgical" bleeding from a non-microvascular source. option was given to apply irreversible dysfunction if care was withdrawn due to poor prognosis related to hemorrhage.
hepatic
severe: bilirubin > 5.0 mg/dl and albumin < 2.0 g/dl and prothrombin time or partial thromboplastin time > 1.5 times control
irreversible: severe criteria plus hepatic encephalopathy and/or hepatorenal syndrome not responsive to treatment. option was given to apply irreversible dysfunction if care was withdrawn due to poor prognosis related to hepatic organ system dysfunction.
gastrointestinal
severe: resectable ruptured or necrotic bowel, or pancreatitis causing shock (map < 65 mmhg for > 2 h or requiring vasopressors)
irreversible: inoperable ruptured or necrotic bowel or pancreatitis causing irreversible shock. option was given to apply irreversible dysfunction if care was withdrawn due to poor prognosis related to gastrointestinal organ system dysfunction.
renal
severe: either creatinine > 5.0 mg/dl or requiring hemodialysis
irreversible: renal failure with acidosis, hyperkalemia, and/or hypercalcemia causing irreversible cardiac arrest. option was given to apply irreversible dysfunction if care was withdrawn due to poor prognosis related to renal organ system dysfunction.
* definition of sepsis changed to reflect current practices. please see appendix 3, online supplement for previous definition of severe and irreversible sepsis syndrome
° definition of severe pulmonary organ system dysfunction changed to reflect current practices. previously defined by stapleton et al. as "[acute respiratory distress syndrome], bilobar pneumonia, bronchopleural fistula, or pulmonary embolism documented by high-probability ventilation/perfusion scan or pulmonary angiogram"
∞ pao2: arterial partial pressure of oxygen; fio2: fraction of inspired oxygen
† blood pressure parameters previously described by stapleton et al. as "hypotension" for irreversible hematologic organ system dysfunction or "systolic bp < 80" for severe hemorrhagic and gi organ system dysfunction changed to "map < 65 mmhg"
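the study's definition of insupportable oxygenation or ventilation (pao2 < 40 mmhg on fio2 of 1.0 for > 2 h, or respiratory acidosis with ph < 7.1 on maximum ventilator settings) can be written as a simple predicate. the sketch below is a hypothetical illustration of how those thresholds combine; in the study itself the determination was made by physician chart review, the class and field names here are assumptions, and the code is not a clinical tool.

```python
from dataclasses import dataclass


@dataclass
class RespiratoryStatus:
    """Hypothetical record of the values the definition refers to."""
    pao2_mmhg: float             # arterial partial pressure of oxygen
    fio2: float                  # fraction of inspired oxygen (0.21 to 1.0)
    hours_sustained: float       # how long the pao2/fio2 values persisted
    ph: float                    # arterial ph
    respiratory_acidosis: bool   # acidosis judged respiratory in origin
    on_max_vent_settings: bool   # already on maximum ventilator settings


def insupportable_oxygenation(s: RespiratoryStatus) -> bool:
    """pao2 < 40 mmhg on fio2 of 1.0 for more than 2 h."""
    return s.fio2 >= 1.0 and s.pao2_mmhg < 40 and s.hours_sustained > 2


def insupportable_ventilation(s: RespiratoryStatus) -> bool:
    """Respiratory acidosis with ph < 7.1 on maximum ventilator settings."""
    return s.respiratory_acidosis and s.ph < 7.1 and s.on_max_vent_settings


def meets_insupportable_criteria(s: RespiratoryStatus) -> bool:
    """Either arm of the definition is sufficient."""
    return insupportable_oxygenation(s) or insupportable_ventilation(s)
```

either arm alone satisfies the definition, which is why the two checks are combined with a logical or; the separate option to assign irreversible dysfunction when care was withdrawn for poor pulmonary prognosis is a clinical judgment and is deliberately not modeled here.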
one of the major findings of stapleton et al.'s study was the relatively low proportion of deaths due to insupportable oxygenation or ventilation, occurring in only 13-19% [3]. given the increased awareness and effort to treat sepsis in the period after this original study, we hypothesized that pulmonary dysfunction may be a more common primary cause of death in a contemporary ahrf cohort. however, we found that only 22% of patients had pulmonary dysfunction as the primary cause of death, and only a handful of patients (2%) had insupportable oxygenation and/or ventilation. there are several potential explanations for these findings. first, with more consistent use of lung protective ventilation, contemporary ahrf patients may be less likely to develop ventilator-induced lung injury and progressive respiratory failure [20]. second, patients with severe ards may be more likely to be initiated on extracorporeal membrane oxygenation therapy prior to developing refractory pulmonary dysfunction [21]. finally, other strategies such as prone positioning may prevent refractory hypoxemia [22]. however, these hypotheses do not explain why a similar proportion of patients still ultimately die from respiratory failure despite not developing insupportable oxygenation and/or ventilation. while some patients may be supported through the initial phase of their respiratory failure, eventually life support is withdrawn when providers are unable to completely reverse their need for significant respiratory support.
table footnotes: ahrf acute hypoxemic respiratory failure, ards acute respiratory distress syndrome
* sofa sequential organ failure assessment. represents the highest sofa score within the first 24 h of ahrf onset
† other risk factors for ards/ahrf, each present in < 10% of the cohort, include major trauma (9%), major surgery (7%), pulmonary contusion (3%), pancreatitis (2%), major burn (1%), inhalation injury (1%), vasculitis (< 1%), or drowning (0%)
our study also highlights the increasing proportion of deaths that occur after a decision to withdraw or not escalate life support. stapleton et al. showed that from 1981 to 1998, the proportion of ards deaths that occurred after withdrawal of life support rose from 40 to 67% [3]. similar trends have been reported for all-cause critically ill patients during this time period [9]. our study suggests that this trend has continued, as we report that 85% of all deaths among ahrf patients now occur after a decision to withdraw or not escalate life support. our finding is also consistent with a recent study showing that 90% of deaths among critically ill patients treated in europe from 2015 to 2016 occurred in the setting of treatment limitations [23]. there are likely several explanations for why a growing proportion of deaths occur after withdrawal of life support. stapleton et al. hypothesized that icu clinicians have earlier and more frequent goals-of-care discussions [3], as is recommended in various clinical practice guidelines [17]. indeed, early multidisciplinary meetings with patients and families may lead to an earlier transition to palliative care among patients likely to die [24, 25]. more recently, there has been increased emphasis on family involvement in icu decision-making and treatment planning, for example, as recommended in the abcdef treatment bundle [26]. overall, the greater emphasis on family involvement in early shared decision making may contribute to earlier transitions to palliation among patients who ultimately die in the icu [27].
our study has several limitations. first, as a single-center study, its generalizability may be limited. however, we examined all deaths among patients with ahrf over a 2-year period who were treated in 5 distinct icus with different practice patterns. as such, we believe these findings are more broadly applicable.
second, while we tried to harmonize our study definitions to those of stapleton et al. to facilitate cross-study comparisons, some changes had to be made to account for interval changes in definitions (e.g., sepsis) and treatments (e.g., high-flow oxygen). we limited deviations in study definitions to those deemed absolutely necessary to reflect the current state of icu practice. third, patients were classified as having undergone withdrawal of life support regardless of the time lag between withdrawal and death. for patients in whom only minutes elapsed between withdrawal of support and death, death may be more accurately representative of the cessation of medical interventions due to futility. however, our approach for determining rates of withdrawal and the rates of withdrawal we observed are consistent with prior reports [9] . fourth, given a high rate of withdrawal of life support, the most proximate cause of death is cessation of support. however, our methodology identifies which organ dysfunction or syndrome most directly led to that decision, thereby reflecting the primary pathophysiologic cause of death. fifth, there may be some subjectivity to assigning cause of death. however, we developed a standardized approach to assess causes of death based on the presence of irreversible and severe organ dysfunctions and confirmed excellent interrater reliability in identifying the primary cause of death among reviewers, which serves to strengthen the validity of our methodology. furthermore, chart review was performed by physicians only, as medical training may limit the subjectivity in identifying cause of death. in this contemporary cohort study of 385 patients who died after ahrf, the most common primary causes of death were sepsis and pulmonary dysfunction. few patients had insupportable oxygenation or ventilation, but most received substantial respiratory support in the 72 h prior to death. 
the vast majority of deaths occurred after a decision to withdraw or not escalate life support. patients with ards were more likely to have a primary cause of death of pulmonary dysfunction and to receive substantial respiratory support during the 72 h prior to death.
supplementary information accompanies this paper at https://doi.org/10.1186/s13054-020-03108-w. additional file 1: appendix 1. redcap abstraction tool. appendix 2. inter-rater reliability. appendix 3. previous definition of severe and irreversible sepsis syndrome. appendix 4. examples. appendix 5. determining cause of death by organ system. etable 1. total organ system dysfunction. etable 2. cause of death by icu setting.
abbreviations ahrf: acute hypoxemic respiratory failure; ards: acute respiratory distress syndrome; icu: intensive care unit; sofa: sequential organ failure assessment
[1] the epidemiology of acute respiratory failure in critically ill patients
[2] epidemiology, patterns of care, and mortality for patients with acute respiratory distress syndrome in intensive care units in 50 countries
[3] causes and timing of death in patients with ards
[4] an official american thoracic society/european society of intensive care medicine/society of critical care medicine clinical practice guideline: mechanical ventilation in adult patients with acute respiratory distress syndrome
[5] ventilation with lower tidal volumes as compared with traditional tidal volumes for acute lung injury and the acute respiratory distress syndrome
[6] efficacy and safety of a paired sedation and ventilator weaning protocol for mechanically ventilated patients in intensive care (awakening and breathing controlled trial): a randomised controlled trial
[7] a binational multicenter pilot feasibility randomized controlled trial of early goal-directed mobilization in the icu
[8] early intensive care unit mobility therapy in the treatment of acute respiratory failure
[9] increasing incidence of withholding and withdrawal of life support from the critically ill
[10] palliative care in intensive care units: why, where, what, who, when, how
[11] the changing role of palliative care in the icu
[12] interobserver reliability of the berlin ards definition and strategies to improve the reliability of ards diagnosis
[13] differences between patients in whom physicians agree and disagree about the diagnosis of acute respiratory distress syndrome
[14] evaluating delivery of low tidal volume ventilation in six icus using electronic health record data
[15] acute respiratory distress syndrome: the berlin definition
[16] the berlin definition of ards: an expanded rationale, justification, and supplementary material
[17] surviving sepsis campaign: international guidelines for management of sepsis and septic shock
[18] the third international consensus definitions for sepsis and septic shock (sepsis-3)
[19] hospital deaths in patients with sepsis from 2 independent cohorts
[20] comparison of the berlin definition for acute respiratory distress syndrome with autopsy
[21] extracorporeal life support organization registry report 2012
[22] prone positioning in severe acute respiratory distress syndrome
[23] changes in end-of-life practices in european intensive care units from 1999 to
[24] an intensive communication intervention for the critically ill
[25] impact of a proactive approach to improve end-of-life care in a medical icu
[26] the abcdef bundle in critical care
[27] limitation of life-sustaining care in the critically ill: a systematic review of the literature
we thank daniel molling, ms, of va ccmr, for his careful data analysis.
authors' contributions sk made substantial contributions to the conception and design of the work, acquisition, analysis, and interpretation of the data, and drafted and substantively revised the work. ys, hm, tb, aw, ic, dc, and jm made substantial contributions to the acquisition of data.
hp made substantial contributions to the conception and design of the work, analysis, and interpretation of the data and drafted and substantively revised the work. ms made substantial contributions to the conception and design of the work, acquisition, analysis, and interpretation of the data and drafted and substantively revised the work. all authors have approved the submitted version and have agreed both to be personally accountable for the authors' own contributions and to ensure that questions related to the accuracy or integrity of any part of the work, even ones in which the author was not personally involved, are appropriately investigated, resolved, and the resolution documented in the literature. dr. prescott was supported in part by k08 gm115859 from the nih/nigms. the datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request. this study was approved by the university of michigan institutional review board. this study does not involve living individuals and therefore consent was waived. competing interests this material is the result of work supported with resources and use of facilities at the ann arbor va medical center. this manuscript does not represent the views of the department of veterans affairs or the us government. the authors declare that they have no competing interests. key: cord-343042-9mue4eiv authors: bertozzi, giuseppe; maglietta, francesca; baldari, benedetta; besi, livia; torsello, alessandra; di gioia, cira rosaria tiziana; sessa, francesco; aromatario, mariarosaria; cipolloni, luigi title: mistrial or misdiagnosis: the importance of autopsy and histopathological examination in cases of sudden infant bronchiolitis-related death date: 2020-05-27 journal: front pediatr doi: 10.3389/fped.2020.00229 sha: doc_id: 343042 cord_uid: 9mue4eiv pediatrics, among all the branches of medicine, is a sector not particularly affected by a high number of claims. 
nevertheless, the economic value of the compensation is significantly high, for example, in cases of children who suffered multiple disabilities following perinatal lesions with a long life expectancy. in italy, most of the claims for compensation concern surgical pathologies and infections. among these latter, the dominant role is taken by respiratory tract infections. in this context, the purpose of this manuscript is to present a case series of infant deaths in different emergency-related facilities (ambulances, emergency rooms) denounced by relatives. following these complaints, the autopsy was performed, and subsequent histological examinations revealed the presence of typical and pathognomonic histological findings of acute viral bronchiolitis, whose morphological appearance is poorly reported in the literature. the analysis of these cases made it possible to highlight the following conclusions: the main problems in diagnosing sudden death causes, especially in childhood, are the rapidity of death and the scarce correlation between the preexistent diseases and of the cause of death itself. for all these reasons, the autopsy, either clinical or medicolegal, is mandatory in cases of sudden unexpected infant death to manage claim requests because only the histological examinations performed on samples collected during the autopsy can reveal the real cause of death. pediatrics, among the branches of medicine, is not particularly affected by a high number of claims. nevertheless, the economic value of compensation is significantly high, for example, in cases of children who suffered multiple disabilities following perinatal lesions with a long life expectancy. according to carroll and buddenbaum (1) and moriani et al. 
(2), who examined the data collected by an association of several american insurance companies (physician insurers association of america), only 28% of the cases resulted in compensation; in cases where no damages had been paid, the average cost of the defense alone was $28,779, whereas it was $67,502 for paid claims (3). the medical diagnoses most commonly involved in civil pediatric trials in the united states were brain damage (average damages $440,379) and meningitis ($437,423). respiratory problems in newborns account for $270,607. in italy, data from the insurance company carige spa for the period 2005-2012, excluding neonatology, showed that the main pathologies for which legal action was taken related to surgery (gastrointestinal and testicular) and infections (mostly respiratory). moreover, the data on litigation have also shown a geographical stratification in the number of requests for compensation, which were more numerous in the north and fewer in central italy, mostly involving the public health system (4). according to an italian study covering the 2005-2010 period, neonatology showed a similar geographical stratification, with the public sector most involved. the claims for damages following death in the neonatal intensive care unit (nicu) mainly concern respiratory diseases (30.7% of cases) (5, 6). in the pediatric population in general, and especially in the neonatal one, the dominant role is played by respiratory tract infections. in this context, the purpose of this case series is to demonstrate how identifying the correct cause of death in sudden unexpected infant deaths (suids) made it possible to establish the absence of medical liability. in particular, defining gold standard methods for similar cases could be very important to avoid compensation for unjustified claims.
all procedures performed in the study were in accordance with the ethical standards of the institution and with the 1964 helsinki declaration and its later amendments or comparable ethical standards. written informed consent was obtained from the first-degree relatives.
a 10-month-old male infant died during emergency medical services (ems) transport to the hospital. when the parents were asked about any change in their child's habits, they reported a mild "rhinitis" lasting a few days. for this reason, they had visited their trusted pediatrician 2 days earlier, who suggested saline nasal rinses and a short-term follow-up check. the medical examiner documented no relevant external sign to explain death. therefore, the parents sued the pediatrician for both penal and civil liability. during the forensic autopsy, the macroscopic examination was unremarkable except for mild edema affecting both lungs. in contrast, histological examination showed diffuse transmural inflammation of the bronchiolar wall in both lungs. other tissue sections showed chronic inflammation and bronchiolar wall fibrosis primarily restricted to bronchioles (figure 1).
an ambulance was called for a 9-month-old female infant, who lived in a nomad camp; her parents reported that she had suddenly stopped responding to external stimuli. relatives did not report any symptoms or clinical signs in the previous days. the infant died during resuscitation maneuvers at the scene. thus, the prosecutor ordered the autopsy for alleged medical liability; her parents sought civil compensation from the local health insurance, believing that there had been medical responsibility during the ambulance transport. the macroscopic examination of both the external body surface and the internal organs showed only severe pulmonary edema. the histology was characterized by lymphocytic infiltration of the bronchioles (figure 2).
an 18-month-old female infant was admitted to the emergency room of a pediatric hospital for severe cough and pharyngitis; she died after a few hours. symptom onset occurred the day before hospitalization. she was a preterm infant (29.3 weeks, birth weight 1,400 g) who suffered from severe respiratory distress at birth (apgar score at 1 min = 4) and needed a long period of hospitalization. after discharge, she showed neurodevelopmental impairment; moreover, a month before death, she suffered from several viral infections such as influenza and mononucleosis, all of which were successfully treated with standard pharmacological therapies. in this case, the judicial authority ordered the forensic examination, suspecting medical liability, to clarify penal and civil aspects: indeed, at the time of death, a claim for damages had been made to the hospital by the family of the patient. the autopsy showed congestion of the tracheal and bronchial mucosa. at histological examination, focal edema and diffuse congestion of both lungs, acute emphysema, and peribronchial and intrabronchial wall leukocyte infiltrates were found; the same findings involved nearby septal vessels (figure 3).
the method used in these cases, as in all cases of sudden death, consists of a rigorous and multidisciplinary methodological approach (7, 8):
- anamnestic collection and clinical findings: the clinical symptomatology presented by the subject before death or in close chronological concurrence, the clinical history of the case, and previous medical records;
- anatomical evidence: a macroscopic examination of all organs (weight, consistency, and color) at autopsy, as well as the appearance of fluids (blood, urine, vitreous humor);
- histomorphological examination: h&e staining of the organ samples to study any alteration due to a pathological condition;
- immunohistochemistry: to evaluate, in particular, the presence and location of the main white blood cells via anti-leukocyte common antigen (cd45) antibody.
adherence to this diagnostic procedure not only allows a larger quantity of data to be checked and evaluated but also provides a complete evaluation on all fronts, which can confirm a suspicion and/or yield a result that is unexpected but crucial for the investigation.
in the discussed cases, following both the autopsy and especially the microscopic examination, the cause of death was identified in all investigated cases: a rapidly progressive acute bronchiolitis was ascertained. these findings made it possible to exonerate the doctors from any penal liability. bronchiolitis was defined in 2006, in a collaboration between the american academy of pediatrics (aap) and the european respiratory society (ers), as "a constellation of clinical symptoms and signs including a viral upper respiratory prodrome followed by increased respiratory effort and wheezing in children <2 years of age" (9). acute viral bronchiolitis (avb) is a lower respiratory tract infective disorder, typically affecting infants <2 years old (90% of cases). respiratory syncytial virus (rsv) is involved in up to 70% of cases, followed by rhinovirus (up to 25%); the remaining cases are related to coronavirus, adenovirus, influenza and parainfluenza virus, and human metapneumovirus. coinfections are common (10). the seasonality of bronchiolitis, which is more frequent during the winter months, coincides with the seasonal pattern of rsv diffusion (11). the infection starts in the upper respiratory tract, spreading to the lower airways in a few days.
bronchiolar damage is caused by the direct action of the virus on the epithelium of this tract or, alternatively, is indirectly immune-mediated; it is characterized by peribronchial infiltration of white blood cells, mainly mononuclear cells, with edema of the submucosa and adventitia (9). the subsequent pathophysiology involves a combination of edema, increased mucus production, and progressive damage of the epithelium up to necrosis, which causes airflow obstruction, distal air trapping, atelectasis, and alteration of ventilation/perfusion. the results are hypoxemia and increased work of breathing, which in turn worsens hypoxemia (9). the most important extrapulmonary manifestations involve the brain (apnea, status epilepticus) and heart (ventricular tachycardia, ventricular fibrillation, cardiogenic shock, complete heart block, and pericardial tamponade) and are common in children with severe infections (12). the most feared complication of bronchiolitis is central apnea, a respiratory pause with bradycardia, cyanosis, pallor, and hypotonia that often requires hospitalization (13). bronchiolitis is a disease with high morbidity but low mortality. death from respiratory failure in bronchiolitis is rare, ranging from 2.9 per 100,000 in the uk to 5.3 per 100,000 in the us for children under 12 months, and the rate decreases as intensive care practices improve (9, 14-16). in all these cases, in the absence of clinical-anamnestic data that can guide the clinical diagnosis, the external examination and the macroscopic autopsy findings could point toward a diagnosis of suid. indeed, in the study by özdemir et al. (17), of all the malpractice claims examined, 57.5% of the children had died and 59.3% were subjected to autopsy.
in these cases, the causes of death reported before and after the autopsies were different in 68%, and the medical staff was found to be responsible for 46.1% of the claims. therefore, the determination of the exact cause of death assumes fundamental importance to ascertain the causal link of any conduct of both health facilities and individual professionals in determining death (18) (19) (20) . this allows not only a measurement of the quality of care provided by promoting public trust for the health system but also as a measure of clinical governance; moreover, it is possible to better manage the medicolegal disputes as a guarantee of ascertaining the truth. indeed, in italy, the data on the frequency of adverse events (aes), preventable adverse events (paes), and negligent adverse events (naes) are available; nevertheless, data about malpractice claims are not available both under the penal and civil points of view. furthermore, the epidemiological purposes cannot be forgotten, considering that it is the only reliable method of data collection. indeed, a complete methodological approach, integrating clinical data, autopsy, and histological findings could be considered the best way to solve similar cases. in fact, in the reported case studies, histopathologic diagnostics identified pathognomonic signs of acute bronchiolitis characterized by edema, congestion, leukocytic infiltration in the bronchiolar wall, leukocytes in the peribronchial interstitial pulmonary space, allowing the identification of the exact cause of death. therefore, these pieces of evidence have allowed excluding the medical responsibility in the reported cases, demonstrating that there are events not related to the supplied health care. the analysis of the presented cases shows that the autopsy is mandatory in suid occurrence, in which the absence of anamnestic data and/or acute clinical signs does not allow to identify the cause of death. 
hypothesizing medical negligence in each case, the autopsy was performed following the judicial appointment after the relative's complaint. the subsequent histological examinations revealed the presence of typical and pathognomonic histological findings of avb, whose morphological appearance is poorly described in the literature. only the postmortem examinations have allowed excluding medical liability and therefore the compensation for damage. in light of these findings, it could be considered essential an accurate evaluation of similar cases, collecting all data to avoid compensation in unjustified claims made against the hospital. in this way, it is possible to contain the hospital costs related to this kind of accident. for all these reasons, the autopsy combined with the subsequent examination represents a gold standard method to identify the absence of the hospital's responsibility in suid cases. all datasets generated for this study are included in the article/supplementary material. written informed consent was obtained from the first-degree relatives for the publication of this case report. gb, fm, ma, and lc contributed to the conception of the study and wrote the manuscript. fs, bb, lb, at, and cd contributed significantly to literature review and manuscript preparation. gb, fm, fs, bb, lb, at, cd, ma, and lc helped perform the analysis with constructive discussions and approved the final version. malpractice claims involving pediatricians: epidemiology and etiology suicide by sharp instruments: a case of harakiri medical diagnoses commonly associated with pediatric malpractice lawsuits in the united states pediatric claims in italy during a 8-years survey neonatal malpractice claims in italy: how big is the problem and which are the causes? 
post-mortem magnetic resonance foetal imaging: a study of morphological correlation with conventional autopsy and histopathological findings a multidisciplinary approach is mandatory to solve complex crimes: a case report italian mafia: a focus on apulia mafia with a literature review acute bronchiolitis in infants, a review prospective multicenter study of viral etiology and hospital length of stay in children with severe bronchiolitis effect of prematurity on respiratory syncytial virus hospital resource use and outcomes extrapulmonary manifestations of severe respiratory syncytial virus infection -a systematic review frequency of apnea and respiratory viruses in infants with bronchiolitis clinical predictors of radiographic abnormalities among infants with bronchiolitis in a paediatric emergency department variation in the management of infants hospitalized for bronchiolitis persists after the american academy of pediatrics bronchiolitis guidelines risk factors for bronchiolitis-associated deaths among infants in the united states medical malpractice claims involving children medical records quality as prevention tool for healthcare-associated infections (hais) related litigation: a case series personalised healthcare: the dima clinical model healthcare-associated infections: not only a clinical burden, but a forensic point of view key: cord-023355-yi2bh0js authors: o'brien, mauria a.; kirby, rebecca title: apoptosis: a review of pro‐apoptotic and anti‐apoptotic pathways and dysregulation in disease date: 2008-12-18 journal: j vet emerg crit care (san antonio) doi: 10.1111/j.1476-4431.2008.00363.x sha: doc_id: 23355 cord_uid: yi2bh0js objective – to review the human and veterinary literature on the biology of apoptosis in health and disease. data sources – data were examined from the human and veterinary literature identified through pubmed and references listed in appropriate articles pertaining to apoptosis. 
human data synthesis – the role of apoptosis in health and disease is a rapidly growing area of research in human medicine. apoptosis has been identified as a component of human autoimmune diseases, alzheimer's disease, cancer, and sepsis. veterinary data synthesis – research data available from the veterinary literature pertaining to apoptosis and its role in diseases of small animal species is still in its infancy. the majority of veterinary studies focus on oncologic therapy. most of the basic science and human clinical research studies use human blood and tissue samples and murine models. the results from these studies may be applicable to small animal species. conclusions – apoptosis is the complex physiologic process of programmed cell death. the pathophysiology of apoptosis and disease is only now being closely evaluated in human medicine. knowledge of the physiologic mechanisms by which tissues regulate their size and composition is leading researchers to investigate the role of apoptosis in human diseases such as cancer, autoimmune disease and sepsis. because it is a multifaceted process, apoptosis is difficult to target or manipulate therapeutically. future studies may reveal methods to regulate or manipulate apoptosis and improve patient outcome.

all tissues must be able to tightly control cell numbers and tissue size and to protect themselves from rogue cells that threaten homeostasis. in the early 1970s, kerr et al 1 observed a single-cell-death phenomenon that occurred in the dying cells of healthy tissues, as well as in cells associated with teratogenesis, neoplasia, tumor regression, atrophy, and involution. the term apoptosis, from greek origins (apo = for, ptosis = falling), was chosen to describe the cellular process of programmed cell death. [1] [2] [3] apoptosis is a tightly regulated intracellular program in which cells destined to die activate enzymes that degrade the cell's dna and nuclear and cytoplasmic proteins.
4 programmed cell death eliminates unwanted cells or potentially reactive cell lines either before or after maturation. this process is vital to fetal and embryonic development and to tissue remodeling. 3 cell populations that normally have a high rate of proliferation, such as the intestinal epithelium, depend upon apoptosis to maintain the necessary number of cells. 5 the number of activated immune cells must be controlled to contain the inflammatory response. 6, 7 hormone-dependent apoptosis occurs during estrus and causes prostatic atrophy after castration. 8 kerr et al 1 observed that apoptotic cells share many morphologic features distinct from those in necrotic cells. cells undergoing apoptosis exhibit 1 or more of the following: cell shrinkage, chromatin condensation and nucleosomal fragmentation, and bubbling of the plasma membrane (blebbing). biochemical features of apoptosis include dna fragmentation, protein cleavage at specific locations, increased mitochondrial membrane permeability, and the appearance of phosphatidylserine on the cell membrane surface. 3, 9 there is an increase in mitochondrial permeability leading to the release of pro-apoptotic proteins and subsequent formation of apoptotic bodies. the resulting membrane-bound apoptotic bodies are consumed by neighboring cells or by macrophages. apoptosis is a single-cell event, and does not induce an inflammatory reaction. apoptosis must be distinguished from necrosis, which is also a form of cellular death. in contrast to apoptosis, necrosis is not a genetically programmed function, it affects groups of neighboring cells, and produces an inflammatory response. 10 the death of a cell by necrosis leads to the release of alarm signal molecules that stimulate 1 or more pattern-recognition receptors on macrophages, dendritic cells, and natural killer cells. 
11 the presence of necrotic cells in a tissue is frequently interpreted by the immune system as dangerous and therefore acts as a signal to initiate an immune response. 11 unlike apoptosis, with necrosis there is cellular swelling with loss of cell membrane integrity, organelle swelling, and lysosomal leakage. the degradation of dna is random and lysed cells are ingested by macrophages. 12 whether a cell survives or dies by apoptosis is determined by the balance between pro-apoptotic (stress or death) signals and anti-apoptotic (mitogenic or survival) signals within and around the cell (see tables 1 and 2). cell injury via oxygen deprivation, heat stress, chemical agents, radiation, infectious agents, genetic derangements, nutritional imbalances, immunologic reactions (eg, anaphylaxis), and other types of severe cell stress will initiate the pro-apoptotic pathways. 4 dysregulation of apoptosis can affect the equilibrium between cell growth and cell death, resulting in organ dysfunction. apoptosis in health is a finely balanced process. too much or too little apoptosis contributes to disease. apoptosis of infected cells is part of the host's defense mechanism. some viruses and bacteria, however, have developed the ability to inhibit the infected cell's apoptotic mechanisms and protect their environment. 13 inhibition of apoptosis is linked to uncontrolled cell growth and the formation of many types of cancer. in humans, excessive apoptosis is linked to stroke and alzheimer's disease. 14 the activation or restoration of apoptosis is emerging as a key strategy for treatment of cancer and other diseases. [15] [16] [17] [18] [19] [20] our aim is to provide a basic review of the literature regarding the mechanisms and regulation of apoptosis. extracellular ligand-directed or intracellular stress-induced stimuli can activate this highly regulated process.
caspases play a central role by initiating and executing the intracellular cascade of events that result in protein and nucleic acid cleavage, and ultimately, cell death. many of the key apoptotic proteins have been identified; however, there is still much to learn regarding the molecular mechanisms of action or activation of these proteins. knowledge of the pro-apoptotic and anti-apoptotic cell pathways is important to understanding the mechanisms of many life-altering diseases in humans and animals and realizing the potential for novel therapeutics. apoptosis can be genetically encoded or can occur in response to cellular or external stimuli. there are 3 features that characterize apoptosis: protein cleavage or hydrolysis, breakdown of nuclear dna, and recognition of the apoptotic cell by phagocytic cells. 4 the cleavage of proteins primarily occurs with the activation of a family of cysteine proteases called caspases (cysteine aspartate-specific proteases). 8 caspases are synthesized in an inactive form and activated by specific initiation mechanisms. 10 programmed cell death can also result from caspase-independent mechanisms triggered by cell membrane receptor-ligand binding or damage to cell organelles. [21] [22] [23] [24] [25] [26]

initiation of caspase cascades

there are 3 known pathways that initiate the activation of caspase cascades and the programmed death of a cell. the route utilized is dependent on the initial death signal, the cell type involved, and the balance between pro-apoptotic and anti-apoptotic signals. 10 one initiating path may lead to another, with cross-talk between them possible. two of the pathways, the death receptor (dr) (extrinsic) and mitochondrial (intrinsic), have been detailed in the literature.
2, 14, [27] [28] [29] the third is an intrinsic pathway involving the endoplasmic reticulum (er) and is the least understood. [30] [31] [32] [33] [34] this pathway is believed to be a pathologically relevant form of apoptosis occurring in response to cellular stress. 10, 35

abbreviations: bcl-2, b-cell lymphoma 2; bcl-xl, bcl-2-associated protein xl; bax, bcl-2-associated protein x; bak, bcl-2-associated protein k; c-flip, flice-like inhibitory protein; nf-kb, nuclear factor-kb; ikb, inhibitory-kb; iaps, inhibitor of apoptosis proteins; xiap, x-linked inhibitor of apoptosis protein; jak, janus kinase; stat, signal transducers and activators of transcription; mapk, mitogen-activated protein kinase; pkr, protein kinase r; cdk, cyclin-dependent kinase.

extrinsic (dr) pathway: the extrinsic pathway (see figure 1) begins with pro-apoptotic receptors on the cell's surface activated by a pro-apoptotic molecule or ligands specific for that receptor. these cell drs belong to the tumor necrosis factor (tnf) receptor superfamily, with the fas receptor and tnfr1 as the most intensely studied members. 15 fas is present on a variety of cell types including activated b cells and t cells. 36 ligands that activate pro-apoptotic receptors include the fas ligand (fasl) and tnf-a [37] [38] [39] [40] [41] [42] (table 1). fasl is expressed by a variety of cell types, including activated t cells and natural killer cells. 36 tnf-a is produced predominantly by activated monocyte/macrophages and lymphocytes. 26 the intracellular portion of the dr is known as the death domain (dd). once 3 or more dr-ligand complexes bunch, their dds are brought into close proximity and a binding site for an adaptor protein is formed. the adaptor protein is specific for that receptor (eg, fas-associated dd [fadd] or tnf receptor-associated dd [tradd]).
this complex of ligand-receptor-adaptor protein is called the death-inducing signaling complex (disc); it recruits and assembles initiator caspases 8 and 10. 43-46 these caspases can now undergo self-processing and release active caspase enzyme molecules into the cytosol. here, they activate the effector caspases 3, 6, and 7. 15, 47, 48 figure 1 illustrates the sequence of events that trigger the extrinsic pathway.

figure 1: the extrinsic or death receptor (dr) pathway. pro-apoptotic ligands, death signals, and fasl bind to fas or tnfrs. the intracellular portion of the dr is known as the death domain (dd). bunching of the receptor-ligand complexes groups their dds and a binding site for an adaptor protein is formed. this ligand-receptor-adaptor protein complex is called the death-inducing signaling complex (disc). it recruits and assembles initiator caspase-8 that releases active caspase enzyme molecules into the cytosol. here, they activate the effector caspases-3 and -7, resulting in nuclear protein cleavage and the initiation of apoptosis. fasl, fas ligand; tnfr, tumor necrosis factor receptor; fadd, fas-associated death domain; tradd, tnf-associated death domain; c-flip, flice-like inhibitory protein; disc, death-inducing signaling complex.

intrinsic mitochondrial pathway: the intrinsic mitochondrial pathway (see figure 2) is initiated from within the cell in response to cellular stresses such as dna damage, radical oxygen species, radiation, hormone or growth-factor deprivation, chemotherapeutic agents, cytokines, and glucocorticoids. 30 initiation of this pathway eventually results in the release of pro-apoptotic proteins from the mitochondria that will activate caspase enzymes and trigger apoptosis. [49] [50] [51] [52] the success of the pathway in inducing apoptosis depends on the balance of activity between pro-apoptotic and anti-apoptotic members of the b-cell lymphoma-2 (bcl-2) superfamily of proteins (table 1).
the bcl-2 superfamily of proteins derives its name from being the second member of a range of proteins found in follicular lymphoma. 4, 53 all of the bcl-2 family members are present on the outer mitochondrial membranes as dimers where they control membrane permeability in ion channel fashion or through the creation of pores. 53 the permeability of the mitochondrial outer membrane determines whether or not there is release of the pro-apoptogenic substances from the mitochondria. this bcl-2 family of proteins is subdivided into 3 groups based on structural similarities and functional criteria. group i possesses anti-apoptotic activity while groups ii and iii promote cell death. 2

figure 2: the mitochondrial or intrinsic pathway. activation of the pro-apoptotic proteins bax and bak occurs through conversion of bid to tbid by caspase-8 or -10 and through activation of puma, noxa, or other bh3 initiator proteins when p53 is induced by dna damage. activated bax and bak oligomerize at the mitochondrial membrane and cause the release of several mitochondrial factors. cytochrome c combines with apaf-1 and procaspase-9 forming an apoptosome. also released from the mitochondria are smac/diablo, proteins that inactivate iaps. activated caspase-9 then is able to activate caspase-3 or -7 allowing apoptosis to proceed. also released from the mitochondria are endog and aif that stimulate apoptosis independent of caspases. bcl-2 and bcl-xl block the activation of bax and bak. bcl-2, b-cell lymphoma-2; iap, inhibitor of apoptosis protein; apaf-1, apoptosis-activating factor-1; smac, second mitochondrial-derived activator of caspases; diablo, direct inhibitor of apoptosis-binding protein with low pi; bh3, bcl-homology-3; tbid, truncated bid; endog, endonuclease g; aif, apoptosis-inducing factor; bax, bcl-2-associated protein x; bak, bcl-2-associated protein k.

the bcl-2 family members share 1 or more of 4 characteristic domains of homology crucial for function.
the anti-apoptotic bcl-2 family proteins, such as bcl-2 and bcl-xl, contain all 4 domains and exert their control of mitochondrial permeability by stimulating adp/atp exchange, stabilizing the mitochondrial inner transmembrane potential, and preventing the opening of a permeability transition pore. 54 overexpression of bcl-2 and bcl-xl is known to be associated with a number of human malignancies 49, 55, 56 (table 2). these proteins also act by inhibiting the action of the pro-apoptotic proteins, bax and bak. pro-apoptotic proteins of the bcl-2 family initiate apoptosis by blocking the anti-apoptotic activity of bcl-2 and bcl-xl by binding to their mitochondrial binding sites or by triggering the activation of pro-apoptotic bax/bak. 57 a third type of pro-apoptotic activity is through the cytoplasmic protein, bid. this molecule is found in the cytoplasm in an inactive form. when cleaved by activated caspase-8 from the extrinsic pathway, bid (once activated, referred to as t-bid) causes a structural change to bax making it similar to the structure of the anti-apoptotic molecule, bcl-2, allowing bax to translocate to the mitochondria. [58] [59] [60] [61] this is but one method of cross-talk that occurs between the intrinsic and extrinsic pathways. each bcl-2 family member can interact with other bcl-2 members, so that large numbers of heterodimer combinations within a cell are possible. cells with more pro-death proteins are sensitive to death and cells with an excess of protective family members are usually apoptosis-resistant. 2 there are at least 3 current theories describing the exact mechanism by which the bcl-2 pro-apoptotic proteins lead to increased mitochondrial permeability. the first theory describes bcl-2 proteins inserting into the mitochondrial membrane and directly forming a channel.
2, 62 a second theory explains the bcl-2 pro-apoptotic proteins interacting with other mitochondrial membrane proteins, possibly voltage-dependent anion channel, to form large pores. 63 the size of the voltage-dependent anion channel is too small for proteins to pass, so this model assumes that there is a conformational change with bcl-2 binding. 2 the third theory describes the bcl-2 proteins modulating the mitochondrial proteins resulting in the formation of a permeability transition pore and loss of membrane potential, organelle swelling, and loss of cytochrome c from the pore. 64, 65 once the permeability of the membrane has been compromised, cytochrome c is released and combines with a cytosolic molecule called apoptosis activating factor-1 (apaf-1). cytochrome c and apaf-1 combine with procaspase-9 for activation of this caspase (figure 2). the binding of these 3 substances forms an apoptosome, which then activates procaspase-3. alternate substances can initiate the intrinsic and extrinsic pathways (table 1). phosphoprotein p53 is a transcription factor that regulates the cell cycle and functions as a cell stress sensor molecule capable of inducing the intrinsic mitochondrial pathway. factors that damage dna, such as ionizing radiation, genotoxic drugs, and free radicals, activate p53. activated p53 promotes apoptosis primarily through its ability to suppress the transcription of anti-apoptotic factors like bcl-2, to induce the manufacture of pro-apoptotic factors like bax and insulin growth factor binding protein-3, and to upregulate the fas receptor. 10, 66 bcl-xl can inhibit p53, 64 and inactivation or loss of p53 is a common abnormality of many human cancers. 66, 67 one veterinary study showed that cats diagnosed with oral squamous cell carcinoma and exposed to secondhand tobacco smoke had an increased risk of overexpressing p53. in human cancer, p53 is the most commonly disrupted gene and is also the most frequently mutated gene in human oral cancer.
68 intrinsic er pathway: the third and least understood pathway is referred to as the endoplasmic reticulum or er pathway. 30 it involves caspase-12 and is said to be able to function independently of the mitochondria. 32 cellular stresses such as hypoxia, glucose starvation, disturbances in calcium homeostasis, and exposure to free radicals injure the er, resulting in the unfolding of proteins and reduction in protein synthesis. in normal cells, an adaptor protein, tnf receptor-associated factor 2 (traf2), is bound to procaspase-12, rendering it inactive. stress of the er leads to the dissociation of traf2 and activation of caspase-12. 34, 35, 69 once activated, caspase-12 cleaves procaspase-9, which then cleaves procaspase-3. this mechanism is independent of the mitochondria although there is evidence that caspase-12 can cause the release of cytochrome c from the mitochondria and thus stimulate the intrinsic pathway 35 (table 1). there are other mechanisms of cross-talk between the intrinsic pathways that initiate apoptosis. ito et al 69 demonstrated that er stress can cause activation of a positive apoptosis regulator, c-abl tyrosine kinase, known for its tumorigenic characteristics, resulting in the release of cytochrome c from the mitochondria. 25, 69 in addition, the mitochondrial pathway pro-apoptotic bcl-2 family protein, bak, has been implicated in causing er depletion of calcium, which can induce caspase-12 activation 70 (table 1).

the central executioners of apoptosis are part of a large protein family known as the caspases. 2, 71 to date, 14 caspases have been identified. specific caspases are found in relatively large amounts as inactive precursors called procaspases within the cytoplasm. procaspases can be activated by 1 of 3 methods: (1) exposure to another activated caspase, (2) autocatalysis, or (3) association with an activator protein, such as caspase-9, apaf-1, and cytochrome c.
2 the caspases involved in apoptosis are subdivided into initiator caspases (2, 8, 9, 10) and effector (or executioner) caspases (3, 6, 7). the initiator caspases are activated by adaptor-mediated self-cleavage. the specific interaction of caspases and activator protein promotes the formation of a multimeric complex that is necessary to bring 2 caspase precursors together to activate each other and produce an active tetramer. 72, 73 this caspase cascade strategy of activation is used by the initiator caspases to cleave and activate the effector caspases. 65 when activated, the effector caspases selectively cleave a restricted set of target proteins at sites that follow an aspartate residue. in most cases, this results in inactivation of the target protein. however, they can also activate proteins, either directly by cleaving off a negative regulatory domain, or indirectly, by inactivating a regulatory subunit. 2, 74 effector caspases cause cytoskeletal filament aggregation, clumping of ribosomal particles and rearrangement of rough er to form a series of concentric whorls as seen on electron microscopy (table 1). 10, 65 caspases are also responsible for cleaving nuclear lamins required for nuclear shrinking and budding, and for loss of cellular shape and membrane blebbing. 2, 75, 76 caspases will activate caspase-activated dnases (cad) that break down nuclear dna, the second feature of apoptosis. cad exists in an inactive form (icad) in the nucleus when it is bound to a subunit. once the effector caspase-3 is activated it migrates to the nucleus and cleaves the inhibitory subunit thus activating cad. 77 cad is the nuclease responsible for breaking down the dna into 50-300-kb pieces that are then cleaved into 180-200-bp fragments by endonucleases. it is these fragments that compose the dna ladders visualized by agarose gel electrophoresis, a biochemical hallmark of apoptosis.
10

alternate pathways

granzyme b: there is an accessory method of triggering apoptosis by the serine protease granzyme b, a lymphocyte granular enzyme expressed by activated cytotoxic t lymphocytes and natural killer cells. granzyme b will bind to its cell surface receptor, an insulin-like growth factor ii receptor, causing endocytosis of the protease. it remains in the endocytic vesicle until stimulation by an activated cytotoxic t cell. 78 activation of almost all of the activator and effector caspases can occur through the action of the granzyme b pathway. a number of caspase sensitive targets, as well as other unique proteins not normally cleaved by caspases, can be cleaved directly by granzyme b. 74, 79 this mechanism is dependent on mitochondrial disruption, as overexpression of anti-apoptotic bcl-2 will halt this process. granzyme-mediated apoptosis is integral to the immune surveillance machinery that rids the body of virally infected or malignantly transformed cells 26 (table 1).

mitochondrial factors: increased mitochondrial outer membrane permeability can result in the release of mitochondrial pro-death substances in addition to cytochrome c, such as apoptosis inducing factor, 67 smac/diablo 80-82 (second mitochondrial-derived activator of caspases/direct inhibitor of apoptosis-binding protein with low pi), endonuclease g, htra2/omi, and several procaspases 83 (eg, procaspase-2, -3, and -9). smac/diablo and htra2/omi bind to cytoplasmic inhibitor of apoptosis proteins (iaps), neutralizing their anti-apoptotic activity. the migration of apoptosis inducing factor from the mitochondria to the nucleus induces caspase-independent chromatin condensation and dna fragmentation. endonuclease g can also directly cause the break up of dna independent of caspases 84 (table 1, figure 2).

ceramide: the ceramide/sphingomyelin pathway can proceed with or without caspase interaction.
ceramides are sphingolipid-signaling mediators involved in the regulation of differentiation, growth suppression, cell aging, stress responses, and apoptosis. various stresses, such as ultraviolet radiation, radical oxygen species, chemotherapeutics, il-1b, tnf-a, or fas activation, initiate this pathway. 21 sphingomyelinases are activated by binding to tnf receptor family dd, fadd, and tradd and cleave sphingomyelin, a member of the phospholipid bilayer, into ceramide. 22 ceramide can also be generated on lysosomes, er, and mitochondrial membranes, when stimulated by a variety of cytotoxic agents. [85] [86] [87] increased levels of ceramide initiate an apoptotic program involving mitochondrial membrane disruption (table 1). ceramide-induced apoptosis is mediated through the mitochondria when ceramide accumulates in the mitochondrial membrane. elevation of ceramide and sphingosine results in increased mitochondrial membrane permeability, cytochrome c release, and activation of caspase-9. 26 the susceptibility of a cell to ceramide-induced apoptosis depends on the cell's bax to bcl-2 ratio. 23 ceramide production, dna damage, and radical oxygen species also stimulate cellular lysosomal release of cathepsins, which leads to release of mitochondrial factors, activation of procaspase-9 and -3, and activation of bid. 24, 25

inhibitors of apoptosis

the most well-known anti-apoptotic factors are members of the bcl-2 family (table 2). there is a dynamic equilibrium between the anti-apoptotic members and pro-apoptotic members of the bcl-2 family. additional inhibitors include iaps, flice-like inhibitory protein (c-flip) and nuclear factor-kb (nf-kb).

iaps: a failsafe inhibitory mechanism exists in the intrinsic pathway. pro-apoptotic activity is counterbalanced by a family of at least 8 proteins, known as iaps (eg, survivin and xiap).
the iaps compromise the effector phase of apoptosis through blocking or inactivating caspases [88] [89] [90] [91] [92] [93] (see table 2). they have been found associated with the activated tnf receptors 88 where they block the activation of caspase-8 and are upregulated by nf-kb activation. 89, 90 iaps also act downstream of the mitochondrial release of cytochrome c to prevent activation of caspase-9. 94, 95 caspase-9 uses a peptide from one of its end subunits to attract an iap family molecule. this binding allows caspase-9 to remain dormant even though the initial steps for activation have taken place. 96, 97 the overexpression of iaps is associated with a drug-resistant phenotype of cancer cells. 98

c-flip: c-flip is a protein-deficient caspase homolog. this inhibitor prevents both the binding of caspase-8 to various drs and its activation. 99 the ratio of c-flip to caspase-8 is critical for the assembly of the disc. formation of the disc recruits c-flip to bind to its target molecule. 57 this works as a built-in safety system and alludes to the integral balance between pro-apoptotic and anti-apoptotic mediators. upregulation of c-flip has been associated with diverse hematologic cancer cell lines. 100, 101 the sensitization of many cancer cells to death ligand-mediated apoptosis appears to be mediated by c-flip downregulation. 102, 103 there does not appear to be inactivation of caspase-9 mechanisms because c-flip does not prevent apoptosis induced by granzyme b or by chemotherapeutic drugs and irradiation. 104

nf-kb: nf-kb is a transcription factor and an anti-apoptotic gene regulator composed of a p50/p65 heterodimer. the p65 subunit provides the gene regulatory function. nf-kb is kept quiescent in the cytoplasm as a dimer bound to its repressor, a member of the inhibitor of nf-kb (ikb) family. phosphorylation of ikb by upstream kinases frees nf-kb, which then translocates to the nucleus.
105 the p65 subunit is eventually released from the dna and binds to newly synthesized ikba, which complexes with nf-kb. this complex translocates back into the cytoplasm. stimulation of nf-kb activation has been associated with accelerated growth, resistance to apoptosis, and propensity to form metastases. inversely, inhibition of nf-kb activation produces an increase in apoptosis, indicating that the balance of cell viability versus cell death is maintained to some degree by nf-kb activation. 106

recognition of apoptotic cells

the third characteristic of apoptosis is recognition of the dying cells by phagocytic cells. the apoptotic cell expresses markers on its membrane that are recognized by phagocytes. 4 through internal cellular signals phosphatidylserine, a phospholipid component, is shifted from the inner to the outer layer of the plasma membrane. 10 this allows for early recognition and removal of a dying cell without release of pro-inflammatory mediators, as occurs with necrosis. 4

control of whether the pro-apoptotic or anti-apoptotic pathway is chosen is subject to positive and negative genetic and environmental regulators. pro-apoptotic gene activation will lead to cell death while deactivation of the gene will block apoptotic pathways. genetic regulation can also be modified by exogenous stimuli from the cell's immediate environment. a cell destined or started on a death pathway can receive a survival signal that can save the cell from apoptosis. genetic regulators (mostly pro-apoptotic) include the c-myc gene family, the p53 tumor suppressor gene, drs and the caspase family. the c-myc gene is a member of the janus kinase or jak family, and is involved in both cell proliferation and apoptosis. in the presence of anti-apoptotic cytokines (eg, insulin-like growth factor-1) or negative regulators of apoptosis (eg, bcl-2), c-myc drives cellular proliferation. in the absence of these factors, the c-myc family promotes apoptosis.
10 survival signals, such as growth factors and other soluble mediators, are often released by neighboring cells. hematopoietic cell lines and differentiated cells are dependent on survival factors like granulocyte macrophage colony stimulating factor, granulocyte colony stimulating factor, il-3, and erythropoietin. 10 t and b lymphocytes are dependent on certain interleukins, such as il-7 and il-2, to mature properly or lead to cellular differentiation. 57 the role of environmental survival substances leads to the possibility of dysregulation of apoptosis in the presence of inappropriate survival signals, as observed in tumors and sepsis. 12, 107, 108

apoptosis in disease

dysregulated apoptosis best describes pathologic disease states that induce or inhibit cell death inappropriately. in humans, excessive apoptosis is linked to stroke and alzheimer's disease and reduced apoptosis to cancer and autoimmune disease. 14, 28, 109 apoptosis also plays a key role in sepsis. a review of the current literature in companion animal medicine yields research pertaining to apoptosis in the fields of oncology, orthopedics, and virology, as well as in other disease states. table 3 lists some of the articles of veterinary origin that investigate apoptosis in disease. a feature characteristic of cancer cells is the uncoupling of cell division and cell death; cells that should have died were not properly signaled to do so. 3 oncologic studies are investigating the enhancement of apoptosis to arrest the growth of tumor cells. one mechanism through which normal cellular numbers are maintained is through p53 and its control over apoptosis. deficiencies in p53 can lead to reduced apoptosis and tumor development. 66, 110 some cancers harbor mutations in the p53 genome or disrupt normal p53 functions whereas others increase or overexpress bcl-2 proteins leading to a cessation of the normal cellular death program.
78

apoptosis in sepsis

cellular demise in sepsis can occur through apoptosis as well as necrosis. in 1996, bone first proposed that apoptosis contributes to the multiple organ dysfunction in sepsis. 10, 111 most treatments had been aimed at blunting the over-reactive or pro-inflammatory response. bone proposed that the anergic or hypoimmune aspect of sepsis, when apoptosis becomes most critical, must also be addressed. 111 apoptotic loss of b cells, t cells, and dendritic cells in sepsis decreases antibody production, macrophage activation, and antigen presentation, respectively. 112 leukocytes are responsible for opsonization and phagocytosis of infected cells and antigens at the site of inflammation. neutrophils produce highly toxic and unstable reactive oxygen species and release bactericidal substances. a hallmark of sepsis is the loss of normal apoptosis of neutrophils. this prolonged life produces neighboring cell damage and contributes to activation of pro-inflammatory cytokines. normally, pro-inflammatory mediators (tnf-a, il-1b, il-6, il-8, and ifn-g) released from macrophages and neutrophils have overlapping effects and function to limit damage, combat pathogenic organisms, eliminate foreign antigens, and promote repair. 113 anti-inflammatory cytokines (il-4, il-10, tgf-b, soluble receptors and receptor antagonists) are also quickly released to try to reduce and locally contain the inflammatory response. 114 many of these inflammatory components are the key factors responsible for the dysregulated apoptosis of immune cells in sepsis. key cells involved in the inflammatory process (neutrophils, macrophages, dendritic cells, and lymphocytes) are also cells targeted for apoptosis. apoptosis of immune cells is normally not a pathologic process because inflammatory cells must be eliminated so that inflammation does not continue unabated.
115, 116 however, in sepsis or other overwhelming inflammatory processes (like trauma 117 and severe burns) there is extensive cell death of lymphocytes 118 and dendritic cells and delayed cell death of neutrophils. this leads to a blunted immune response coinciding with increased cellular damage. neutrophils are one of the first cells to migrate to the site of inflammation, with an average half-life of 6-12 hours when unstimulated. 119, 120 once a neutrophil is released into circulation, its apoptotic program has been activated. studies have shown that sepsis can shorten as well as prolong the life span of the activated neutrophils (early or delayed apoptosis, respectively). [121] [122] [123] [124] early apoptosis of neutrophils dampens respiratory burst activity and may lessen secondary tissue injury. 125 delayed apoptosis of neutrophils contributes to increased cellular damage, especially in the lung, liver, kidneys, and gastrointestinal tract. acute respiratory distress syndrome (ards) is marked by significant pulmonary accumulation of neutrophils, 126, 127 and is considered to be a direct effect of neutrophil-induced injury. cells retrieved from the lungs of septic patients show reduced rates of neutrophil apoptosis, with the degree of inhibition paralleling the severity of sepsis. 128, 129 additional studies have shown that increased apoptosis (via a fas/fasl-dependent mechanism) of pulmonary epithelial cells will lead to permeability changes characteristic of ards. 130, 131 the cytokines produced by activated neutrophils summon macrophages to the area of inflammation; cytokines are also produced by tissue macrophages in response to foreign invasion. macrophages are antigen-presenting cells (apc) capable of engulfing foreign material, infected cells, and apoptotic cells through recognition of specific cell surface molecules. dendritic cells, another type of apc, are viewed as the sentinels of the immune system.
132 like macrophages, mature dendritic cells are able to activate lymphocytes through the presentation of antigen. lymphocytes need 2 signals to stimulate differentiation and initiation of the immune response. the first signal is the presentation of antigen, which accounts for the specificity of the response. the second signal involves costimulatory molecules on apcs or apc secretion of cytokines. the costimulatory molecules interact with specific t cell sites, producing a pro-apoptotic or anti-apoptotic state. failure of the appropriate second signal after interaction with an apc results in anergy or apoptosis of the lymphocyte. 133, 134 anergy is a state of unresponsiveness to antigen. 112 this is also the mechanism for self-tolerance. 135 immature dendritic cells are capable of ingesting apoptotic cells, but this will render them incapable of maturing and stimulating t cells. 136 macrophages and dendritic cells will secrete il-10 after engulfing apoptotic cells. il-10, considered an anti-inflammatory cytokine, selectively blocks the maturation of dendritic cells. 137 il-10 has also been shown to suppress the phagocytic activity as well as pro-inflammatory cytokine production of alveolar macrophages. 138 il-6 is also secreted by dendritic cells after ingesting apoptotic cells, leading to autocrine blockage of maturation. 139 this lack of maturation leads to a tolerogenic state with no further stimulation of the immune system. after ingesting apoptotic cells, dendritic cells can mature only when there are danger signals expressed by the apoptotic cell or when dendritic cells are engulfing an excessive number of apoptotic cells. this leads to secretion of pro-inflammatory cytokines il-1β and tnf-α by these signaled dendritic cells. 137 inadequate clearance of surplus apoptotic cells results in these cells becoming necrotic and inducing a pro-inflammatory response.
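the two-signal requirement for lymphocyte activation and the fate of dendritic cells after ingesting apoptotic cells, as described above, can be read as simple decision rules. the python sketch below is only an illustration of that reading; the function and label names are hypothetical, and the `NO_RESPONSE` branch is an assumption because the text does not describe a lymphocyte that never encounters antigen.

```python
from enum import Enum


class LymphocyteFate(Enum):
    ACTIVATED = "activated"
    ANERGY_OR_APOPTOSIS = "anergy or apoptosis"
    NO_RESPONSE = "no response"  # assumption: this case is not covered in the text


def lymphocyte_fate(antigen_presented: bool, second_signal: bool) -> LymphocyteFate:
    """Signal 1 is antigen presentation; signal 2 is costimulation or apc cytokines."""
    if not antigen_presented:
        return LymphocyteFate.NO_RESPONSE
    if second_signal:
        return LymphocyteFate.ACTIVATED
    # antigen was presented but the appropriate second signal failed
    return LymphocyteFate.ANERGY_OR_APOPTOSIS


def dc_state_after_ingestion(danger_signals: bool, excessive_load: bool) -> str:
    """State of a dendritic cell that has ingested apoptotic cells."""
    if danger_signals or excessive_load:
        return "mature, pro-inflammatory"  # secretes il-1β and tnf-α
    return "immature, tolerogenic"  # maturation blocked (il-10, il-6)


if __name__ == "__main__":
    print(lymphocyte_fate(True, False).value)
    print(dc_state_after_ingestion(danger_signals=False, excessive_load=False))
```

the rules are deliberately binary; in vivo these outcomes depend on graded signals that a boolean sketch cannot capture.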
140 studies have demonstrated apoptosis of intestinal and splenic b cells, cd4 t cells and dendritic cells in sepsis. [141] [142] [143] overwhelming infection should lead to massive clonal expansion of b and t lymphocytes 142 but instead there is significant loss of these cell lines in sepsis. lack of stimulation by apcs undergoing apoptosis leads to poor b cell and t cell stimulation. these unstimulated t cells are removed by apoptosis. in as many as 30% of bacteremic patients who die from systemic inflammatory response syndrome or multiple organ dysfunction syndrome, no focus of infection is identified. premature b cell and intestinal epithelial cell death through apoptosis in the intestines is one theory to explain intestinal bacterial translocation and loss of the first line of intestinal defense. 10

treatment strategies

current human treatment strategies for manipulating apoptosis focus mainly on cancer and sepsis. a pubmed literature search at the time of writing revealed a total of 645 citations for the terms "apoptosis" and "veterinary". most of the studies are experimental at this point, but many oncologic studies are manipulating apoptosis in the treatment of their patients (table 3). in human cancer research, peptidomimetics are in the early stages of experimental study. they are synthetic peptides that are resistant to enzymatic degradation and are being used for their pro-apoptotic effects on bid as well as their antagonism of iaps. 144, 145 in cancer treatments, there are trials attempting to block the overexpression of bcl-2 because it is the cessation of normal apoptosis that leads to the growth of tumors. in addition to the direct effects on the apoptotic programs, discoveries are being made that allow chemotherapeutics to act synergistically with various anti-apoptotic therapeutics. 78 treatment strategies for sepsis had previously targeted the hyperimmune phase rather than the hypoimmune phase.
addressing apoptosis can be therapeutically challenging because targeted blockade of apoptosis in lymphocyte populations must be specific enough to primarily target those cell populations undergoing increased apoptosis and sufficiently transient to prevent the risk of malignant transformation associated with prolonged blockade of apoptosis. 146 targeting specific pathways is therefore difficult. attempts at blocking the circulating mediators and cytokines that induce apoptosis have not been successful because of the inherent redundancy and fail-safe mechanisms of the apoptotic pathways. 119 caspase inhibitors show promise because caspases are common factors to many apoptotic pathways. broad-spectrum caspase inhibitors have been shown to prevent lymphocyte apoptosis and improve survival in animal models of sepsis. 13, 147, 148 caution must be expressed, though, because increased dosages of caspase inhibitors can cause cytotoxicity and tnf-α-induced injury. 149, 150 gene therapy studies have shown that overexpression of bcl-2 delays or blocks apoptosis and improves survival in septic mice. 10, 151, 152 other therapies target akt, a regulator of cell proliferation and death. mice overexpressing akt have reduced lymphocyte apoptosis and increased survival after cecal ligation and puncture. 153 fas fusion proteins, and attempts at altering gene expression of members of the dr pathways, are also areas of ongoing apoptotic research. 147, [154] [155] [156] [157] recombinant human activated protein c, a product being used successfully in some septic patients, 158, 159 may counteract the induction of apoptosis in monocyte 160 and endothelial cell lines, modulating the inflammatory and coagulation cascades during sepsis. [161] [162] [163] [164] [165] [166] it also helps to attenuate the levels of pro- and anti-apoptotic proteins in favor of survival.
in contrast to anti-apoptotic strategies, recent studies have addressed the hypothesis that apoptosis in sepsis may in some cases be beneficial by downregulating the inflammatory response. earlier onset of apoptosis may, in fact, favor survival. giamarellos-bourboulis et al 167 have associated monocyte apoptosis at the onset of sepsis with a favorable outcome due to a decrease in the amount of pro-inflammatory cytokines produced.

apoptosis is a normal biologic process necessary to maintain cellular homeostasis. characteristic pathways lead to this form of cell death, and they are continually influenced by local cellular events, growth factors, and neighboring stresses. this complicated system has numerous built-in avenues and fail-safe mechanisms, including pro-apoptotic and anti-apoptotic factors. during sepsis and cancer, just 2 of the many diseases causing dysregulation of the apoptotic process, cells are either killed too quickly or survive too long. newer therapies, designed to manipulate apoptosis depending on the pathology involved, can promote or delay this form of cellular demise. although apoptosis is an extremely complicated process, understanding the relationship between sepsis and apoptosis will undoubtedly lead to new treatment modalities.
apoptosis: a basic biological phenomenon with wide-ranging implications in tissue kinetics
the biochemistry of apoptosis
robbins and cotran pathologic basis of disease
regulation of cell number in the mammalian gastrointestinal tract; the importance of apoptosis
apoptosis and programmed cell death in immunity
apoptosis and the immune system
apoptosis: a different type of cell death
morphological features of cell death
cellular apoptosis and organ injury in sepsis: a review
apoptosis: controlled demolition at the cellular level
immunology of apoptosis and necrosis
role of apoptotic cell death in sepsis
the mitochondrion: is it central to apoptosis
targeting apoptosis pathways in cancer
targeted induction of apoptosis in cancer management: the emerging role of necrosis factor-related apoptosis-inducing ligand receptor activating agents
promoting apoptosis as a strategy for cancer drug discovery
new approaches and therapeutics targeting apoptosis in disease
drug insight: cancer therapy strategies based on restoration of endogenous cell death mechanisms
apoptosis-based therapies for hematologic malignancies
cd95 (fas/apo-1) signals ceramide generation independent of effector stage of apoptosis
sphingosine kinase signalling in immune cells
activation of bax by ceramide is independent of caspases
regulatory role of cathepsin d in apoptosis
role of mitochondria as the gardens of cell death
cell death signalling pathways in the pathogenesis and therapy of haematologic malignancies: overview of apoptotic pathways
a conserved xiap-interaction motif in caspase-9 and smac/diablo regulates caspase activity and apoptosis
cd95's deadly mission in the immune system
ways of dying: multiple pathways to apoptosis
textbook of critical care
mediators of endoplasmic reticulum stress-induced apoptosis
caspase-12 and er-stress-mediated apoptosis: the story so far
coupling endoplasmic reticulum stress to the cell death program
activation of caspase-12, an endoplasmic reticulum (er) resident caspase, through tumor necrosis factor receptor associated factor 2-dependent mechanism in response to er stress
an endoplasmic reticulum stress-specific caspase cascade in apoptosis
activation and differentiation of autoreactive b-1 cells by interleukin 10 induce autoimmune hemolytic anemia in fas-deficient antierythrocyte immunoglobulin transgenic mice
the role of cap3 in cd95 signaling: new insights into the mechanism of procaspase-8 activation
protein kinase c regulates fadd recruitment and death-inducing signaling complex formation in fas/cd95-induced apoptosis
structural requirements for signal-induced target binding of fadd determined by functional reconstitution of fadd deficiency
a mechanism for death receptor discrimination by death adaptors
death receptors
death receptor signaling
a novel protein that interacts with the death domain of fas/apo1 contains a sequence motif related to the death domain
fadd, a novel death domain-containing protein, interacts with the death domain of fas and initiates apoptosis
cytotoxicity-dependent apo-1 (fas/cd95)-associated proteins form a death-inducing signaling complex (disc) with the receptor
caspase-10 is an initiator caspase in death receptor signaling
targeting death and decoy receptors of the tumor-necrosis factor superfamily
caspases: pharmacological manipulation of cell death
the role of the bcl-2 protein family in cancer
pharmacological manipulation of bcl-2 family members to control cell death
caspases: enemies within
extrinsic versus intrinsic apoptosis pathways in anticancer chemotherapy
bcl-x(l) forms an ion channel in synthetic lipid membranes
the bcl-2 protein family: arbiters of cell survival
the role of bcl-2 family members in the progression of cutaneous melanoma
death and anti-death: tumour resistance to apoptosis
apoptosis in the development of the immune system
bid, a bcl2 interacting protein, mediates cytochrome c release from mitochondria in response to activation of cell surface death receptors
cleavage of bid by caspase 8 mediates the mitochondrial damage in the fas pathway of apoptosis
caspase cleaved bid targets mitochondria and is required for cytochrome c release, while bcl-xl prevents this release but not tumor necrosis factor-r1/fas death
bid-dependent and bid-independent pathways for bax insertion into mitochondria
double identity for proteins of the bcl-2 family
bcl-2 family proteins regulate the release of apoptogenic cytochrome c by the mitochondrial channel vdac
in vivo mitochondrial p53 translocation triggers a rapid first wave of cell death in response to dna damage that can precede p53 target gene activation
cell death induced by acute renal injury: a perspective on the contributions of apoptosis and necrosis
new tricks of an old molecule: lifespan regulation by p53
apoptosis inducing factor (aif): a phylogenetically old, caspase-independent effector of cell death
p53 expression and environmental tobacco smoke exposure in feline oral squamous cell carcinoma
targeting of the c-abl tyrosine kinase to mitochondria in endoplasmic reticulum stress-induced apoptosis
bax and bak can localize to the endoplasmic reticulum to initiate apoptosis
human ice/ced-3 protease nomenclature
autoproteolytic activation of pro-caspases by oligomerization
membrane oligomerization and cleavage activates the caspase-8 (flice/machalpha1) death signal
granzyme b directly and efficiently cleaves several downstream caspase substrates: implications for ctl-induced apoptosis
lamin proteolysis facilitates nuclear events during apoptosis
caspase-dependent proteolysis of integral and peripheral proteins of nuclear membranes and nuclear pore complex proteins during apoptosis
apoptotic dna fragmentation
apoptosis: a basic biological phenomenon with wide-ranging implications in human disease
autoantigens as substrates for apoptotic proteases: implications for the pathogenesis of systematic autoimmune disease
identification of diablo, a mammalian protein that promotes apoptosis by binding to and antagonizing iap proteins
smac, a mitochondrial protein that promotes cytochrome c-dependent caspase activation by eliminating iap inhibition
mitochondria, the killer organelles and their weapons
the mitochondrion in cell death control: certainties and incognita
mitochondrial effectors in caspase-independent cell death
a mitochondrial pool of sphingomyelin is involved in tnfalpha-induced bax translocation to mitochondria
ceramide induces bcl2 dephosphorylation via a mechanism involving mitochondrial pp2a
ceramide induces mitochondrial activation and apoptosis via a bax-dependent pathway in human carcinoma cells
the tnfr2-traf signaling complex contains two novel proteins related to baculoviral inhibitor of apoptosis proteins
suppression of tumor necrosis factor-induced cell death by inhibitor of apoptosis c-iap2 is under nf-kappab control
nf-kappab antiapoptosis: induction of traf1 and traf2 and c-iap1 and c-iap2 to suppress caspase-8 activation
iaps block apoptotic events induced by caspase-8 and cytochrome c by direct inhibition of distinct caspases
human iap-like protein regulates programmed cell death downstream of bcl-xl and cytochrome c
caspase activation, inhibition and reactivation: a mechanistic view
bax-induced caspase activation and apoptosis via cytochrome c release from mitochondria is inhibitable by bcl-xl
bcl-2, bcl-xl and adenovirus protein e1b19kd are functionally equivalent in their ability to inhibit cell death
a conserved xiap-interaction motif in caspase-9 and smac/diablo regulates caspase activity and apoptosis
baiting death inhibitors
cell death signalling pathways in the pathogenesis and therapy of haematologic malignancies: overview of apoptotic pathways
inhibition of fas death signals by flips
adhesion-mediated intracellular redistribution of c-fas-associated death domain-like il-1-converting enzyme-like inhibitory protein-long confers resistance to cd95-induced apoptosis in hematopoietic cancer cell lines
constitutive expression of c-flip in hodgkin and reed-sternberg cells
an inducible pathway for degradation of flip protein sensitizes tumor cells to trail-induced apoptosis
selective inhibition of flice-like inhibitory protein expression with small interfering rna oligonucleotides is sufficient to sensitize tumor cells for trail-induced apoptosis
flip prevents apoptosis induced by death receptors but not by perforin/granzyme b, chemotherapeutic drug, and gamma irradiation
structure, regulation and function of nf-kappa b
an essential role for nf-kappab in preventing tnf-alpha-induced cell death
apoptosis, cross-presentation, and the fate of the antigen specific immune response
cellular response to oxidative stress: signaling for suicide and survival
clearance of apoptotic and necrotic cells and its immunologic consequences
p53 biological network: at the crossroads of the cellular-stress response pathway and molecular carcinogenesis
sir isaac newton, sepsis, sirs, and cars
the pathophysiology and treatment of sepsis
anti-inflammatory cytokines
apoptosis in the development and maintenance of the immune system
negative selection - clearing out the bad apples from the t-cell repertoire
trauma: the role of the innate immune system
accelerated lymphocyte death in sepsis occurs by both the death receptor and mitochondrial pathways
apoptosis in sepsis: a new target for therapeutic exploration
pathological aspects of apoptosis in severe sepsis and shock?
dysregulated expression of neutrophil apoptosis in the systemic inflammatory response syndrome
circulating mediators in serum of injured patients with septic complications inhibit neutrophil apoptosis through up-regulation of protein-tyrosine phosphorylation
interleukin-10 counterregulates proinflammatory cytokine-induced inhibition of neutrophil apoptosis during severe sepsis
upregulation of reactive oxygen species generation and phagocytosis, and increased apoptosis in human neutrophils during severe sepsis and septic shock
impairment of function in aging neutrophils is associated with apoptosis
neutrophils in the pathogenesis of sepsis
the acute respiratory distress syndrome
immune protection against septic peritonitis in endotoxin-primed mice is related to reduced neutrophil apoptosis
neutrophil apoptosis in acute respiratory distress syndrome
soluble fas ligand induces epithelial cell apoptosis in humans with acute lung injury (ards)
silencing of fas, but not caspase-8, in lung epithelial cells ameliorates pulmonary apoptosis, inflammation, and neutrophil influx after hemorrhagic shock and sepsis
dendritic cell regulation of th1-th2 development
apoptosis in sepsis
homeostasis and self-tolerance of the immune system: turning lymphocytes off
homeostasis and self-tolerance in the immune system: turning lymphocytes off
natural adjuvants: endogenous activators of dendritic cells
cutting edge: bystander apoptosis triggers dendritic cell maturation and antigen-presenting function
alveolar macrophage deactivation in murine septic peritonitis: role of interleukin 10
immunoregulation of dendritic cells
phagocytosis of apoptotic cells and immune regulation
sepsis induces apoptosis and profound depletion of splenic interdigitating and follicular dendritic cells
sepsis-induced apoptosis causes progressive profound depletion of b and cd4+ t lymphocytes in humans
depletion of dendritic cells, but not macrophages, in patients with sepsis
activation of apoptosis in vivo by a hydrocarbon-stapled bh3 helix
a small molecule smac mimic potentiates trail- and tnfalpha-mediated cell death
considering immunomodulatory therapies in the septic patient: should apoptosis be a potential therapeutic target?
blockade of apoptosis as a rational therapeutic strategy for the treatment of sepsis
q-vd-oph, a broad spectrum caspase inhibitor with potent anti-apoptotic properties
caspase activation is not death
caspase inhibition causes hyperacute tumor necrosis factor-induced shock via oxidative stress and phospholipase a2
overexpression of bcl-2 in transgenic mice decreases apoptosis and improves survival in sepsis
mitochondrial membrane potential and apoptosis peripheral blood monocytes in severe human sepsis
akt decreases lymphocyte apoptosis and improves survival in sepsis
inhibition of fas/fas ligand signaling improves septic survival: differential effects on macrophage apoptotic and functional capacity
inhibition of fas signaling prevents hepatic injury and improves organ blood flow during sepsis
in vivo delivery of caspase-8 or fas sirna improves the survival of septic mice
leukocyte apoptosis and its significance in sepsis and shock
hospital mortality and resource use in subgroups of the recombinant human activated protein c worldwide evaluation in severe sepsis (prowess) trial
the effect of drotrecogin alfa (activated) on long-term survival after severe sepsis
influence of drotrecogin alpha (activated) infusion on the variation of bax/bcl-2 and bax/bcl-xl ratios in circulating mononuclear cells: a cohort study in septic shock patients
the apoptotic pathway as a therapeutic target in sepsis
gene expression profile of antithrombotic protein c defines new mechanisms modulating inflammation and apoptosis
recombinant human activated protein c attenuates the inflammatory response in endothelium and monocytes by modulating nuclear factor-kappab
leukocyte and endothelial cell interactions in sepsis: relevance of the protein c pathway
activated protein c
blocks p53-mediated apoptosis in ischemic human brain endothelium and is neuroprotective
apoptosis: target for novel drugs
early apoptosis of blood monocytes in the septic host: is it a mechanism of protection in the event of septic shock?

key: cord-020757-q4ivezyq authors: saikumar, pothana; kar, rekha title: apoptosis and cell death: relevance to lung date: 2010-05-21 journal: molecular pathology of lung diseases doi: 10.1007/978-0-387-72430-0_4 sha: doc_id: 20757 cord_uid: q4ivezyq

in multicellular organisms, cell death plays an important role in development, morphogenesis, control of cell numbers, and removal of infected, mutated, or damaged cells. the term apoptosis was first coined in 1972 by kerr et al. 1 to describe the morphologic features of a type of cell death that is distinct from necrosis and is today considered to represent programmed cell death. in fact, the evidence that a genetic program existed for physiologic cell death came from the developmental studies of the nematode caenorhabditis elegans. 2 as time has progressed, however, apoptotic cell death has been shown to occur in many cell types under a variety of physiologic and pathologic conditions. cells dying by apoptosis exhibit several characteristic morphologic features that include cell shrinkage, nuclear condensation, membrane blebbing, nuclear and cellular fragmentation into membrane-bound apoptotic bodies, and eventual phagocytosis of the fragmented cell (figure 4.1).

cell death is central to the normal development of multicellular organisms during embryogenesis and maintenance of tissue homeostasis in adults. 3 during development, sculpting of body parts is achieved through selective cell death, which imparts appropriate shape and creates required cavities in particular organs. in adults, cell death balances cell division as a homeostatic mechanism regulating constancy of tissue mass. deletion of injured cells because of disease, genetic defects, aging, or exposure to toxins is also achieved by apoptosis. in essence, apoptotic cell death has important biologic roles not only in development and homeostasis but also in the pathogenesis of several disease processes. dysregulation of apoptosis is found in a wide spectrum of human diseases, including cancer, autoimmune diseases, neurodegenerative diseases, ischemic diseases, viral infections, 4 and lung diseases. 5 our knowledge of cell death and the mechanisms of its regulation increased dramatically in the past two decades with the discovery

nevertheless, necrosis has been shown to occur in cells having defects in apoptotic machinery or upon inhibition of apoptosis, 7 and this form of cell death is emerging as an important therapeutic tool for cancer treatment.
8

autophagy

autophagy, which is also referred to as type ii programmed cell death, is characterized by sequestration of cytoplasm and organelles in double or multimembrane structures called autophagic vesicles, followed by degradation of the contents of these vesicles by the cell's own lysosomal system (see figure 4.1). the precise role of autophagy in cell death or survival is not clearly understood. autophagy has long been regarded as a cell survival mechanism whereby cells eliminate long-lived proteins and organelles. in this regard, it is argued that autophagy may help cancer cells survive under nutrient-limiting and low-oxygen conditions and against ionizing radiation. 9, 10 however, recent observations that there is decreased autophagy during experimental carcinogenesis and heterologous disruption of an autophagy gene, beclin 1 (bcn1), in cancer cells 11, 12 suggest that breakdown of autophagic machinery may contribute to development of cancer. other interesting studies have shed some light on the relationship between autophagy and apoptosis. these investigations have shown prevention of caspase inhibitor z-vad-induced cell death in mouse l929 cells by rna interference directed against autophagy genes atg7 and bcn1 13 and protection of bax−/−, bak−/− murine embryonic fibroblasts against staurosporine- or etoposide-induced cell death by rna interference against autophagy genes atg5 and bcn1. 14 however, both of these studies were done in cells whose apoptotic pathways had been compromised. thus, it remains to be seen whether cells with intact apoptotic machinery can also die by autophagy and whether apoptotic-competent cells lacking autophagy genes will be resistant to different death stimuli.

figure 4.1 (legend, partial).
necrosis: there is early membrane damage with eventual loss of plasma membrane integrity and leakage of cytosol into extracellular space. despite early clumping, the nuclear chromatin undergoes lysis (karyolysis).
apoptosis: cells die by type i programmed cell death (also called apoptosis); they are shrunken and develop blebs containing dense cytoplasm. membrane integrity is not lost until after cell death. nuclear chromatin undergoes striking condensation and fragmentation. the cytoplasm becomes divided to form apoptotic bodies containing organelles and/or nuclear debris. terminally, apoptotic cells and fragments are engulfed by phagocytes or surrounding cells.
autophagy: cells die by type ii programmed cell death, which is characterized by the accumulation of autophagic vesicles (autophagosomes and autophagolysosomes). one feature that distinguishes apoptosis from autophagic cell death is the source of the lysosomal enzymes used for most of the dying-cell degradation. apoptotic cells use phagocytic cell lysosomes for this process, whereas cells with autophagic morphology use the endogenous lysosomal machinery of dying cells.
paraptosis: cells die by type iii programmed cell death, which is characterized by extensive cytoplasmic vacuolization and swelling and clumping of mitochondria, along with absence of nuclear fragmentation, membrane blebbing, or apoptotic body formation.
autoschizis: in this form of cell death, the cell membrane forms cuts or schisms that allow the cytoplasm to leak out. the cell shrinks to about one-third of its original size, and the nucleus and organelles remain surrounded by a tiny ribbon of cytoplasm. after further excisions of cytoplasm, the nuclei exhibit nucleolar segregation and chromatin decondensation followed by nuclear karyorrhexis and karyolysis.

paraptosis has recently been described as a form of cell death characterized by extensive cytoplasmic vacuolation (see figure 4.1) caused by swelling of mitochondria and endoplasmic reticulum.
this form of cell death does not involve caspase activation, is not inhibited by caspase inhibitors, but is inhibited by the inhibitors of transcription and translation, actinomycin d and cycloheximide, respectively, 15 suggesting a requirement for new protein synthesis. the tumor necrosis factor receptor family member taj/troy and the insulin-like growth factor i receptor have been shown to trigger paraptosis. 16 paraptosis appears to be mediated by mitogen-activated protein kinases and inhibited by aip1/alix, a protein interacting with the calcium-binding death-related protein alg-2. 16

autoschizis

autoschizis is a recently described type of cell death that differs from apoptosis and necrosis and is induced by oxidative stress. 17 in this type of death, cells lose cytoplasm by self-morsellation or self-excision (see figure 4.1). autoschizis usually affects contiguous groups of cells both in vitro and in vivo but can also occasionally affect scattered individual cells trapped in subcapsular sinuses of lymph nodes. 18 the nuclear envelope and pores remain intact while the cytoplasm is reduced to a narrow rim surrounding the nucleus. the chromatin marginates along the nuclear membrane, and mitochondria and other organelles around the nucleus aggregate as a result of cytoskeletal damage and condensation of the cytosol. interestingly, the rough endoplasmic reticulum is preserved until the late stages of autoschizis, in which cells fragment and the nucleolus becomes condensed and breaks into smaller fragments. 19 eventually, the nuclear envelope and the remaining organelles dissipate with cell demise.

genetic studies in the nematode worm c. elegans led to the characterization of apoptosis. activation of specific death genes during the development of this worm results in death of exactly 131 cells, leaving 959 cells intact.
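as a quick check on the c. elegans figures quoted above, the two counts imply the total number of cells generated during development and the share removed by programmed cell death. the short python sketch below only restates that arithmetic; the variable names are illustrative.

```python
# counts quoted in the text for c. elegans development
cells_dying = 131
cells_surviving = 959

# total cells generated = cells that die + cells that remain
total_generated = cells_dying + cells_surviving
fraction_dying = cells_dying / total_generated

print(total_generated)          # 1090
print(f"{fraction_dying:.1%}")  # 12.0%
```

roughly one in eight cells generated is therefore eliminated by the death program.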
2 further studies revealed that apoptosis can be divided into three successive stages: (1) commitment phase, in which death is initiated by specific extracellular or intracellular signals; (2) execution phase; and (3) clean-up phase, in which dead cells are removed by other cells with eventual degradation of the dead cells in the lysosomes of phagocytic cells. 20 the apoptotic machinery is conserved through evolution from worm to human. 21 in c. elegans, execution of apoptosis is mediated by ced-3 and ced-4 proteins. commitment to a death signal results in the activation of ced-3 by ced-4 binding. the ced-9 protein prevents activation of ced-3 by binding to ced-4. 22, 23

mechanisms of apoptosis

caspases

studies over the past decade have indicated that two distinct apoptotic pathways are followed in mammalian systems: the extrinsic or death receptor pathway and the intrinsic or mitochondrial pathway. the executioners in both intrinsic and extrinsic pathways of cell death are the caspases, 24 which are cysteine proteases with specificity to cleave their substrates after aspartic acid residues. the central role of caspases in apoptosis is underscored by the observation that apoptosis and all classic changes associated with apoptosis can be blocked by inhibition of caspase activity. to date, 12 mammalian caspases (caspase-1 to -10, caspase-14, and mouse caspase-12) have been identified. 25 caspase-13 was later found to represent a bovine homolog and caspase-11 appears to be a murine homolog of human caspases-4 and -5, respectively. caspases are normally produced as inactive zymogens containing an n-terminal prodomain followed by a large and a small subunit that constitute the catalytic core of the protease. they have been categorized into two distinct classes: initiator and effector caspases.
the upstream initiator caspases contain long n-terminal prodomains and one of the two characteristic protein-protein interaction motifs: the death effector domain (ded; caspase-8 and -10) and the caspase activation and recruitment domain (card; caspase-1, -2, -4, -5, -9, and -12). the downstream effector caspases (caspase-3, -6, and -7) are characterized by the presence of a short prodomain. apart from the structural differences, a prominent difference between initiator and effector caspases is their basal state. both the zymogen and the activated forms of effector caspases exist as constitutive homodimers, whereas initiator caspase-9 exists predominantly as a monomer both before and after proteolytic processing. 26 initiator caspase-8 has been reported to exist in an equilibrium between monomers and homodimers. 27 although the effector caspases are activated through cleavage by the initiator caspases, the activation of initiator caspases requires formation of oligomeric complexes with their adapter proteins and often intrachain cleavage within the initiator caspase.

caspases have also been divided into three categories based on substrate specificity. 28 group i members (caspase-1, -4, and -5) have a substrate specificity for the wehd sequence with high promiscuity; group ii members (caspase-2, -3, and -7 and ced-3) prefer the dexd sequence and have an absolute requirement for aspartate (d) at p4; and members of group iii (caspase-6, -8, and -9 and the "aspase" granzyme b) have a preference for (i/l/v)exd sequences. several reports have suggested a role for group i members in inflammation and that of group ii and iii members in apoptotic signaling events.
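the three specificity groups above can be read as simple patterns over the four residues p4 to p1 of a cleavage site. the following is a minimal sketch under that assumption, using one-letter amino acid codes with "x" treated as any residue; the names `GROUP_PATTERNS` and `matching_groups` are illustrative and do not come from the source, and the strict match for group i ignores the promiscuity noted in the text.

```python
import re

# Preferred tetrapeptide motifs (P4-P1) for each group, as listed in the text.
# "x" in the text (any residue) is written as "." in the regular expressions.
# Names and structure here are illustrative assumptions, not an established API.
GROUP_PATTERNS = {
    "group i (caspase-1, -4, -5)": re.compile(r"WEHD"),
    "group ii (caspase-2, -3, -7, ced-3)": re.compile(r"DE.D"),
    "group iii (caspase-6, -8, -9, granzyme b)": re.compile(r"[ILV]E.D"),
}


def matching_groups(tetrapeptide):
    """Return the groups whose preferred motif matches a 4-residue P4-P1 sequence."""
    motif = tetrapeptide.upper()
    return [
        name
        for name, pattern in GROUP_PATTERNS.items()
        if pattern.fullmatch(motif)  # fullmatch enforces exactly four residues
    ]


if __name__ == "__main__":
    for site in ("DEVD", "WEHD", "IETD", "LEHD", "AAAA"):
        print(site, matching_groups(site))
```

the group ii pattern fixes d at p4, which mirrors the stated absolute requirement for aspartate at that position; a sequence that matches no pattern simply returns an empty list.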
the extrinsic pathway involves binding of death ligands such as tumor necrosis factor-α (tnf-α), cd95 ligand (fas ligand), and tnf-related apoptosis-inducing ligand (trail) to their cognate cell surface receptors tnfr1, cd95/fas, trail-r1, trail-r2, and the dr series of receptors, 29 resulting in the activation of initiator caspase-8 (also known as fadd-homologous ice/ced-3-like protease or flice) and subsequent activation of effector caspase-3 (figure 4.2). 30 the cytoplasmic domains of death receptors contain the "death domain," which plays a crucial role in transmitting the signal from the cell's surface to intracellular signaling molecules. binding of the ligands to their cognate receptors results in receptor trimerization and recruitment of adapter proteins to the cell membrane, which involves homophilic interactions between death domains of the receptors and the adapter proteins. the adapter protein for the receptors tnfr1 and dr3 is tnfr-associated death domain protein (tradd) 31 and that for fas, trail-r1, trail-r2, and dr4 is fas-associated death domain protein (fadd). 32 the receptor/ligand and fadd complex in turn recruits caspase-8 to the activated receptor, resulting in the formation of death-inducing signaling complex (disc) and subsequent activation of caspase-8 through oligomerization and self-cleavage. depending on the cell type and/or apoptotic stimulus, caspase-8 can also be activated by caspase-6. 33 activated caspase-8 then activates effector caspase-3. in some cell types, cleavage of caspase-3 by caspase-8 also requires a mitochondrial amplification loop involving cleavage of proapoptotic protein bid by caspase-8 and its translocation to the mitochondrial membrane, triggering the release of apoptogenic proteins from mitochondria into cytosol (see figure 4.2). in these cell types, overexpression of bcl-2 and bcl-xl can block cd95-induced apoptosis.
34 tumor necrosis factor-α is produced by t cells and activated macrophages in response to infection. although tnf-α-mediated signaling can be propagated through either tnfr1 or tnfr2 receptors, the majority of biologic functions are initiated by tnfr1. 35 binding of tnf-α to tnfr1 causes release of the inhibitory protein silencer of death domain (sodd) from tnfr1, which enables recruitment of the adapter protein tradd. signaling induced by activation of tnfr1 or dr3 diverges at the level of tradd. in one pathway, nuclear translocation of the transcription factor nuclear factor-κb (nf-κb) and activation of c-jun n-terminal kinase (jnk) are initiated, which results in the induction of a number of proinflammatory and immunomodulatory genes. 36 in another pathway, tnf-α signaling is coupled to fas signaling events through interaction of tradd with fadd. 37 the tnfr1-tradd complex can alternatively engage traf2 protein, resulting in activation of transcription factor c-jun, which is involved in survival signaling. furthermore, binding of receptor-interacting protein (rip) to tnfr1 through tradd results in activation of transcription factor nf-κb, which suppresses apoptosis through transcriptional upregulation of antiapoptotic molecules such as traf1, traf2, ciap1, ciap2, and flip. the flice-associated huge protein (flash) was identified as a ced-4 homolog interacting with the ded of caspase-8 and was shown to modulate fas-mediated activation of caspase-8. 38 another class of protein, flip (flice inhibitory protein), was shown to block fas-induced and tnf-α-induced disc formation and subsequent activation of caspase-8. 39

cytotoxic t cells play a major role in vertebrate defense against viral infection. 40 they induce cell death in infected cells to prevent viral multiplication and spread of infection. 41 cytotoxic t cells can kill their targets either by activating the fas ligand/fas pathway or by injecting granzyme b, a serine protease, into target cells.
cytotoxic t cells carry fas ligand on their surface but also carry granules containing the channel-forming protein perforin and granzyme b. upon recognizing the infected cells, the lymphocytes bind and secrete granules onto the surface of infected cells. perforin then assembles into transmembrane channels to allow the entry of granzyme b into the target cell. upon entry, granzyme b, which cleaves after aspartate residues in proteins ("aspase"), activates one or more of the apoptotic proteases (caspase-2, -3, -7, -8, and -10) to trigger the proteolytic death cascade (see figure 4.2). fas ligand/fas and perforin/granzyme b systems are the main apoptotic machinery that regulates homeostasis in immune cell populations.

figure legend (fragment): tnf-α/tnfr1 complex can also elicit an antiapoptotic response by recruiting traf2, which results in nf-κb-mediated upregulation of antiapoptotic genes. in cytotoxic t lymphocyte-induced death, granzyme b, which enters the cell through membrane channels formed by the protein perforin, activates caspases by cleaving them directly or indirectly. intracellular pathways: lack of survival stimuli (withdrawal of growth factor, hypoxia, genotoxic substances, etc.) is thought to generate apoptotic signals through ill-defined mechanisms, which lead to translocation of proapoptotic proteins such as bax to the outer mitochondrial membrane. in some cases, transcription mediated by p53 may be required to induce proteins such as bax. translocated bax undergoes conformational changes in the outer membrane to form oligomeric structures (pores) that leak cytochrome c from mitochondria into the cytosol. formation of a ternary complex of cytochrome c, the adapter protein apaf-1, and the initiator caspase-9 results in the activation of caspase-9 followed by sequential activation of effector caspase(s) such as caspase-3 and others. the action of caspases, endonucleases, and possibly other enzymes leads to cellular disintegration. for example, the endonuclease cad (caspase-activated dnase) becomes activated when it is released from its inhibitor icad upon cleavage of icad by an effector caspase. antiapoptotic proteins such as bcl-2 and bcl-xl inhibit the membrane-permeabilizing effects of bax and other proapoptotic proteins. cross-talk between extra- and intracellular pathways occurs through caspase-8-mediated bid cleavage, which yields a 15 kda protein that migrates to mitochondria and releases cytochrome c, thereby setting in motion events that lead to apoptosis via caspase-9.

cells can respond to various stressful stimuli and metabolic disturbances by triggering apoptosis. drugs, toxins, heat, radiation, hypoxia, and viral infections are some of the stimuli known to activate death pathways. cell death, however, is not necessarily inevitable after exposure to these agents, and the mechanisms determining the outcome of the injury are a topic of active interest. the current consensus appears to be that it is the intensity and the duration of the stimulus that determine the outcome. the stimulus must go beyond a threshold to commit cells to apoptosis. although the exact mechanism used by each stimulus may be unique and different, a few broad patterns can be identified. for example, agents that damage dna, such as ionizing radiation and certain xenobiotics, lead to activation of p53-mediated mechanisms that commit cells to apoptosis, at least in part through transcriptional upregulation of proapoptotic proteins. 42 other stresses induce increased activity of stress-activated protein kinases, which result ultimately in apoptotic commitment. 43 these different mechanisms converge in the activation of caspases.
a cascade of caspases plays the central executioner role by cleaving various mammalian cytosolic and nuclear proteins that play roles in cell division, maintenance of cytoskeletal structure, dna replication and repair, rna splicing, and other cellular processes. this proteolytic carnage produces the characteristic morphologic changes of apoptosis. once the caspase cascade is initiated, the process of cell death has crossed the point of no return.

the roles of various caspases in apoptotic pathways and their relative importance for animal development have been examined in genetic studies involving knockout of different caspase genes. a caspase-1 (interleukin [il]-1β converting enzyme [ice]) knockout study suggested that ice plays an important role in inflammation by activating cytokines such as il-1β and il-18. however, caspase-1 was not required to mediate apoptosis under normal circumstances and did not have a major role during development. 44 surprisingly, ischemic brain injury was significantly reduced in caspase-1 knockout mice compared with wild-type mice, 45 suggesting that inflammation may contribute to ischemic injury. caspase-3 deficiency leads to impaired brain development and premature death. also, functional caspase-3 is required for some typical hallmarks of apoptosis such as formation of apoptotic bodies, chromatin condensation, and dna fragmentation in many cell types. 46 lack of caspase-8 results in the death of embryos at day 11 with abnormal formation of the heart, 47 suggesting that caspase-8 is required for cell death during mammalian development. in support of this finding, knockout of fadd, which is required for caspase-8 activation, resulted in fetal death with signs of abdominal hemorrhage and cardiac failure. 48 moreover, caspase-8-deficient cells did not die in response to signals from members of the tnf receptor family.
47 however, cells lacking either fadd or caspase-8, which are resistant to tnf-α-mediated or cd95-mediated death, are susceptible to killing induced by chemotherapeutic drugs, serum deprivation, ceramide, γ-irradiation, and dexamethasone. 48 in contrast, caspase-9 has a key role in apoptosis induced by intracellular activators, particularly those that cause dna damage. deletion of caspase-9 resulted in perinatal lethality, apoptotic failure in developing neurons, enlarged brains, and craniofacial abnormalities. 49 in caspase-9-deficient cells, caspase-3 was not activated, suggesting that caspase-9 is upstream of caspase-3 in the apoptotic cascade. as a consequence, caspase-9-deficient cells are resistant to dexamethasone or irradiation, whereas they retain their sensitivity to tnf-α-induced or cd95-induced death 49 because of the presence of caspase-8, the initiator caspase involved in death receptor signaling that can also activate caspase-3. overall, these observations support the idea that different death signaling pathways converge on downstream effector caspases (see figure 4.2). indeed, caspase-3 is regarded as one of the key executioner molecules activated by apoptotic stimuli originating either at receptors for exogenous molecules or within cells through the action of drugs, toxins, or radiation.

in c. elegans, biochemical and genetic studies have indicated a role for ced-4 upstream of ced-3. 50 upon receiving death commitment signals, ced-4 binds to pro-ced-3 and releases active ced-3. 50 however, when overexpressed, ced-9 can inhibit the activation of pro-ced-3 by binding to ced-4 and sequestering it away from pro-ced-3. therefore, ced-3 and ced-4 are involved in activation of apoptosis, and ced-9 inhibits apoptosis. after the discovery of caspases as ced-3 homologs, a search for activators and inhibitors analogous to ced-4 and ced-9 led to the discovery of diverse mammalian regulators of apoptosis.
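the knockout observations for caspase-8, fadd, and caspase-9 described above can be checked against a toy wiring diagram of the two pathways. the following is a minimal sketch, assuming a simplified directed graph whose edges paraphrase the text; the node labels, the `PATHWAY` dictionary, and the `reaches_caspase3` function are illustrative choices rather than anything defined in the source, and the graph ignores cell-type differences such as the mitochondrial amplification loop being required in some cells.

```python
from collections import deque

# Simplified signaling graph (an assumption for illustration only).
# Each key activates the nodes in its list.
PATHWAY = {
    "cd95_or_tnf": ["fadd"],           # death receptor signal recruits fadd
    "fadd": ["caspase-8"],             # disc formation activates caspase-8
    "caspase-8": ["caspase-3", "bid"], # direct cleavage plus bid cross-talk
    "bid": ["cytochrome_c"],
    "dna_damage": ["bax"],             # e.g. irradiation, acting through p53
    "bax": ["cytochrome_c"],
    "cytochrome_c": ["apaf-1"],
    "apaf-1": ["caspase-9"],
    "caspase-9": ["caspase-3"],
    "caspase-3": [],
}


def reaches_caspase3(stimulus, knocked_out=()):
    """Breadth-first search: can the stimulus still activate caspase-3
    when the listed nodes are removed from the graph?"""
    removed = set(knocked_out)
    if stimulus in removed:
        return False
    seen = {stimulus}
    queue = deque([stimulus])
    while queue:
        node = queue.popleft()
        if node == "caspase-3":
            return True
        for nxt in PATHWAY.get(node, []):
            if nxt not in removed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


if __name__ == "__main__":
    for ko in ((), ("caspase-8",), ("fadd",), ("caspase-9",)):
        print(ko, {s: reaches_caspase3(s, ko) for s in ("cd95_or_tnf", "dna_damage")})
```

in this sketch, removing caspase-8 or fadd blocks the death receptor route but leaves the dna damage route intact, and removing caspase-9 does the reverse, which is the pattern the knockout studies report.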
the plethora of these molecules and their functional diversity allowed them to be classified into four broad categories: (1) adapter proteins, (2) the bcl-2 family of regulators, (3) inhibitors of apoptosis (iaps), and (4) other regulators.

as stated earlier, two major pathways of apoptosis, involving either the initiator caspase-8 or the initiator caspase-9 (see figure 4.2), have been recognized. signaling by death receptors (cd95, tnfr1) occurs through a well-defined process of recruitment of caspase-8 to the death receptor by adapter proteins such as fadd. recruitment occurs through interactions between the death domains that are present on both receptor and adapter proteins. receptor-bound fadd then recruits caspase-8 through interactions between deds common to both caspase-8 and fadd, forming a disc. in the disc, caspase-8 activation occurs through oligomerization and autocatalysis. activated caspase-8 then activates downstream caspase-3, culminating in apoptosis. the inhibitory protein flip was shown to block fas-induced and tnf-α-induced disc formation and subsequent activation of caspase-8. 39 of particular interest is cellular flip, which stimulates caspase-8 activation at physiologically relevant levels and inhibits apoptosis upon high ectopic expression. 51 cellular flip contains two deds that can compete with caspase-8 for recruitment to the disc. this limits the degree of association of caspase-8 with fadd and thus limits activation of the caspase cascade. it also forms a heterodimer with caspase-8 and caspase-10 through interactions between both the deds and the caspase-like domains of the proteins, thus activating both caspase-8 and caspase-10. 52

apoptotic protease activating factor-1 (apaf-1), a ced-4 homolog in mammalian cells, affects the activation of initiator caspase-9.
53 this factor binds to procaspase-9 in the presence of cytochrome c and 2′-deoxyadenosine 5′-triphosphate (datp) or adenosine triphosphate (atp) and activates this protease, which in turn activates a downstream cascade of proteases (see figure 4.2). 54 by and large, apaf-1 deficiency is embryonically lethal and the embryos exhibit brain abnormalities similar to those seen in caspase-9 knockout mice. 55 these genetic findings support the idea that apaf-1 is coupled to caspase-9 in the death pathway. unlike ced-4 in nematodes, apaf-1 requires the binding of atp and cytochrome c to activate procaspase-9. the multiple wd40 repeats in the c-terminal end of apaf-1 have a regulatory role in the activation of caspase-9. 56

the ced-9 homolog in mammals is the bcl-2 protein. bcl-2 was first discovered in b-cell lymphoma as a protooncogene. overexpression of bcl-2 was shown to offer protection against a variety of death stimuli. 57 the bcl-2 protein family includes both antiapoptotic (bcl-2, bcl-xl, bcl-w, mcl-1, nr13, and a1/bfl-1) and proapoptotic proteins (bax, bak, bok, diva, bcl-xs, bik, bim, hrk, nip3, nix, bad, and bid). 58 these proteins are characterized by the presence of bcl-2 homology (bh) domains: bh1, bh2, bh3, and bh4 (figure 4.3). the proapoptotic members have two subfamilies: a multidomain and a bh3-only group (see figure 4.3). the relative ratio of pro- and antiapoptotic proteins determines the sensitivity of cells to various apoptotic stimuli. the best-studied proapoptotic members are bax and bid. exposure to various apoptotic stimuli leads to translocation of bax from the cytosol to the mitochondrial membrane. 59 bax oligomerizes on the mitochondrial membrane along with another proapoptotic protein, bak, leading to the release of cytochrome c from the mitochondria into the cytosol. 60 other proapoptotic proteins, mainly the bh3-only proteins, are thought to aid in bax-bak oligomerization on the mitochondrial membrane.
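the idea that the relative ratio of pro- and antiapoptotic proteins sets a cell's sensitivity can be illustrated with a small calculation. the following is a minimal sketch only: the text gives no formula, so the simple sum-over-sum ratio, the function name `pro_to_anti_ratio`, and any expression values passed to it are hypothetical, and only a few family members named in the text (bax, bak, and bid as proapoptotic; bcl-2 and bcl-xl as antiapoptotic) are included.

```python
# Roles follow the text; the ratio definition itself is an illustrative assumption.
PROAPOPTOTIC = {"bax", "bak", "bid"}
ANTIAPOPTOTIC = {"bcl-2", "bcl-xl"}


def pro_to_anti_ratio(levels):
    """Return summed proapoptotic level divided by summed antiapoptotic level.

    `levels` maps protein name to a relative expression level in arbitrary
    units. Proteins not listed in either set above are ignored.
    """
    pro = sum(value for name, value in levels.items() if name in PROAPOPTOTIC)
    anti = sum(value for name, value in levels.items() if name in ANTIAPOPTOTIC)
    if anti == 0:
        raise ValueError("no antiapoptotic protein level given; ratio undefined")
    return pro / anti


if __name__ == "__main__":
    # Hypothetical expression levels, chosen only to show the calculation.
    primed_cell = {"bax": 2.0, "bak": 1.0, "bcl-2": 1.5}
    protected_cell = {"bax": 1.0, "bcl-2": 2.0, "bcl-xl": 2.0}
    print(pro_to_anti_ratio(primed_cell))
    print(pro_to_anti_ratio(protected_cell))
```

a higher value would correspond to a cell tilted toward apoptosis in this toy picture; real sensitivity also depends on localization, binding partners, and posttranslational modification, none of which a single ratio captures.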
the antiapoptotic bcl-2 family members are known to block bax-bak oligomerization on the mitochondrial membrane and subsequent release of cytochrome c into the cytosol. 60, 61 after release from the mitochondria, cytochrome c is known to interact with the wd40 repeats of the adapter protein apaf-1, resulting in the formation of the apoptosome complex. seven molecules of apaf-1, interacting through their n-terminal caspase activation and recruitment domain, form the central hub region of the symmetric wheel-like structure, the apoptosome. binding of atp/datp to apaf-1 triggers the formation of the apoptosome, which subsequently recruits procaspase-9 into the apoptosome complex, resulting in its activation. 62 activated caspase-9 then activates executioner caspases, such as caspase-3 and caspase-7, eventually leading to programmed cell death.

the iaps, first discovered in baculoviruses and then in insects and drosophila, inhibit activated caspases by directly binding to the active enzymes. 63 these proteins contain one or more baculovirus inhibitor of apoptosis repeat domains, which are responsible for the caspase inhibitory activity. 64 to date, eight mammalian iaps have been identified. they include x-linked iap (xiap), c-iap1, c-iap2, melanoma iap (ml-iap)/livin, iap-like protein-2 (ilp-2), neuronal apoptosis-inhibitory protein (naip), bruce/apollon, and survivin. in mammals, caspase-3, -7, and -9 are inhibited by iaps. 62 there are reports suggesting aberrant expression of iaps in many cancer tissues. for example, ciap1 is overexpressed in esophageal squamous cell carcinoma; 65 the ciap2 locus is translocated in mucosa-associated lymphoid tissue lymphoma; 66 and survivin has been shown to be upregulated in many cancer cells. 67 the caspase inhibitory activity of iaps is inhibited by proteins containing an iap-binding tetrapeptide motif.
62 the founding member of this family is smac/diablo, which is released from the mitochondrial intermembrane space into the cytosol during apoptosis. in the cytosol, it interacts with several iaps and inhibits their function. another mitochondrial protein, omi/htra2, is also known to antagonize xiap-mediated inhibition of caspase-9 at high concentrations. 68 a serine protease, omi/htra2 can proteolytically cleave and inactivate iap proteins and thus is considered to be a more potent suppressor of iaps than smac. 69

it has been reported that the heat shock proteins hsp90, hsp70, and hsp27 can inhibit caspase activation by cytochrome c by interacting either with apaf-1 or with other players in the pathway. 70-72 a high-throughput screen identified a compound called petcm (α-[trichloromethyl]-4-pyridineethanol) as a caspase-3 activator. further work with petcm revealed its involvement in apoptosome regulation. 73 this pathway also includes the oncoprotein prothymosin-α and the tumor suppressor putative hla-dr-associated proteins. the putative hla-dr-associated proteins were shown to promote caspase-9 activation after apoptosome formation, whereas prothymosin-α inhibited caspase-9 activation by inhibiting apoptosome formation.

in an apoptotic cell, the regulatory, structural, and housekeeping proteins are the main targets of the caspases. the regulatory proteins mitogen-activated protein/extracellular signal-regulated kinase kinase-1, p21-activated kinase-2, and mst-1 are activated upon cleavage by caspases. 74 caspase-mediated protein hydrolysis inactivates other proteins, including focal adhesion kinase, phosphatidylinositol-3 kinase, akt, raf-1, iaps, and inhibitors of caspase-activated dnase (icad). upon cleavage by caspases, the antiapoptotic protein bcl-2 is also converted into a proapoptotic, bax-like protein. there are many structural protein targets of caspases, which include nuclear lamins, actin, and regulatory proteins such as spectrin, gelsolin, and fodrins.
75 degradation of nuclear dna into internucleosomal chromatin fragments is one of the hallmarks of apoptotic cell death that occurs in response to various apoptotic stimuli in a wide variety of cells. a specific dnase, cad (caspase-activated dnase), that cleaves chromosomal dna in a caspase-dependent manner, is synthesized with the help of icad. in proliferating cells, cad is always found to be associated with icad in the cytosol. when cells are undergoing apoptosis, caspases (particularly caspase-3) cleave icad to release cad and allow its translocation to the nucleus to cleave chromosomal dna. thus, cells that are icad deficient or that express caspase-resistant icad mutant do not exhibit dna fragmentation during apoptosis.

apoptosis plays a critical role in the postnatal lung. 76 regulated removal of inflammatory cells by apoptosis helps in the resolution of inflammation in the lung. 77 recent evidence also supports a role for apoptosis in the remodeling of lung tissue after acute lung injury 78 and in the pathogenesis of chronic pulmonary hypertension, 79 idiopathic pulmonary fibrosis, and chronic obstructive pulmonary disease. 80, 81

acute lung injury/acute respiratory distress syndrome

acute lung injury, which clinically manifests itself as the acute respiratory distress syndrome (ards), involves disruption of the alveolar epithelium and endothelium, increased vascular permeability, and edema. two main hypotheses link the pathogenesis of ards to apoptosis, namely, the "neutrophilic hypothesis" and the "epithelial hypothesis." these two hypotheses are not mutually exclusive, and both could play important roles in the pathogenesis of ards. the neutrophilic hypothesis suggests that neutrophil apoptosis plays an important role in the resolution of inflammation and that the inhibition of neutrophil apoptosis or the inhibition of clearance of apoptotic neutrophils is deleterious in ards.
82, 83 studies in humans showed that bronchoalveolar lavage fluids from patients with early ards inhibit the rate at which neutrophils develop apoptosis in vitro. 84 the inhibitory effect of bronchoalveolar lavage fluids on neutrophil apoptosis is mediated by granulocyte/macrophage colony-stimulating factor, and possibly by il-8 and il-2. 85, 86 a membrane surface molecule, cd44, has been shown to play an important role in the clearance of apoptotic cells in vivo and in vitro. 87 in a model of bleomycin-induced lung injury, cd44-deficient mice failed to clear apoptotic neutrophils, which was associated with worsened inflammation and increased mortality. 87 activation of phagocytic cells inhibits production of proinflammatory cytokines, including il-1β, il-8, il-10, granulocyte/macrophage colony-stimulating factor, and tnf-α and increases release of anti-inflammatory mediators such as transforming growth factor-β, prostaglandin e2, and platelet-activating factor. 88, 89 the net effects of these changes could favor resolution of inflammation.

the epithelial hypothesis suggests that the apoptotic death of alveolar epithelial cells, in response to soluble mediators such as fas ligand, contributes to the prominent alveolar epithelial injury characteristic of ards. several lines of evidence suggest a role for the fas/fas ligand system in epithelial cell apoptosis. 90 fas is expressed on alveolar and airway epithelial cells, 91, 92 and its expression increases in response to inflammatory mediators such as lipopolysaccharide. fas-mediated lung cell apoptosis is modulated by surfactant protein a, which inhibits apoptosis in vivo. 93

chronic obstructive pulmonary disease

chronic obstructive pulmonary disease, caused primarily by smoking, generally refers to chronic bronchitis and emphysema.
several factors, including protease/antiprotease imbalance, oxidative stress, cigarette smoke-derived toxins, and inflammation mediated by neutrophils, macrophages, and cd8+ t cells, have been shown to contribute to the disease process. furthermore, matrix metalloproteinase 94 and vascular endothelial growth factor receptor inhibition, 95, 96 but not fas/fas ligand, have been shown to play a role in the development of emphysema.

asthma

allergic asthma is characterized by intermittent or persistent bronchoconstriction and has been linked to airway remodeling and chronic inflammation, with increased numbers of eosinophils, cd4+ t cells, and mast cells. although at present a role for apoptosis in asthma is not confirmed, studies ex vivo have shown reduced apoptosis of circulating peripheral cd4+ t cells and eosinophils in asthma, which might contribute to inflammation. corticosteroids used to reduce inflammation in asthma have been shown to induce eosinophil apoptosis. 97

pulmonary fibrosis is characterized by epithelial damage, fibroblast proliferation, and deposition of collagen. although the mechanism of alveolar epithelial cell apoptosis in pulmonary fibrosis is not known, several reports have suggested that the fas pathway, 98 the angiotensin pathway, 99 activated t cell-derived perforin, 100 il-13 stimulation, 101 and transforming growth factor-β1 activation 102 play critical roles.

because insufficient apoptosis is often associated with tumorigenesis, modulation of apoptotic and antiapoptotic targets seems to be an attractive approach to cancer therapy. lung cancers can be divided into small cell lung cancers (sclcs) and non-small cell lung cancers (nsclcs). 103 the sclcs are relatively more sensitive to anticancer drugs and irradiation than are the nsclcs, 104 but the molecular basis for this difference is not clearly known.
evaluation of apoptosis-associated substances has shown that caspase-8, fas, and fas ligand are often downregulated in sclcs but not in nsclcs. 105 an investigation of the basis for these differences revealed that there were no differences in the levels of bax and bcl-xl, but the expression of bcl-2 was found to be significantly higher in sclc than in nsclc cell lines. the observation that in some cases bcl-2 can be converted into a proapoptotic bax-like death molecule may offer an explanation for the paradoxical expression of bcl-2 in sclc. 106 the lack of expression of procaspase-1, -4, -8, and -10 107 reported in sclc suggests that these caspases probably do not contribute to spontaneous apoptosis in these cells. the apoptosis regulators apaf-1 and procaspase-3 are overexpressed and are functional in nsclc cell lines. in both types of lung cancer, apoptotic stimuli result in cytochrome c release and activation of caspase-9 and caspase-3, but only sclc cell lines showed a relocalization of caspase-3 into the nucleus; 108 this suggests that the resistance of nsclc cell lines is probably due to defective relocalization of caspase-3. the expression of caspase-9 and caspase-7 in nsclcs was found to be similar to that in normal lung tissue. 109 however, these cell lines express casp9b, a splice variant of caspase-9 that acts as an apoptosis inhibitor. in vitro, chemotherapy-resistant nsclc cell lines exhibit decreased caspase-9 and caspase-3 expression, 110 which suggests an inhibition of apoptosis induction via apoptosome formation in nsclc. additionally, both nsclc and sclc cells express high and almost equal levels of survivin. 107 the resistant nsclc cells showed higher expression of c-iap2, and the radiosensitive sclc cells exhibited increased expression of xiap. 111 these results suggest no correlation between the level of expression of the iaps and the difference in the radiosensitivity between nsclc and sclc cells.
cell death has become an area of intense interest and investigation in science and medicine because of the recognition that cell death, in general, and apoptosis, in particular, are important features of many biologic processes. involvement of many genes in the death process suggests that cell death is a complex phenomenon with many redundant mechanisms to ensure definitiveness. the realization that defective cell death plays a central role in the pathogenesis of diseases has stimulated work on therapies targeted to these processes, and this work will undoubtedly continue in the future.

1. apoptosis: a basic biological phenomenon with wide-ranging implications in tissue kinetics
2. genetic control of programmed cell death in the nematode c. elegans
3. programmed cell death in animal development
4. apoptosis: definition, mechanisms, and relevance to disease
5. apoptosis as a therapeutic target for the treatment of lung disease
6. four deaths and a funeral: from caspases to alternative mechanisms
7. dual signaling of the fas receptor: initiation of both apoptotic and necrotic cell death pathways
8. alkylating dna damage stimulates a regulated form of necrotic cell death
9. a novel response of cancer cells to radiation involves autophagy and formation of acidic vesicles
10. autophagy: in sickness and in health
11. tissue protein turnover during liver carcinogenesis
12. reduced autophagic activity in primary rat hepatocellular carcinoma and ascites hepatoma cells
13. regulation of an atg7-beclin 1 program of autophagic cell death by caspase-8
14. role of bcl-2 family proteins in a non-apoptotic programmed cell death dependent on autophagy genes
15. an alternative, nonapoptotic form of programmed cell death
16. paraptosis: mediation by map kinases and inhibition by aip-1/alix
17. autoschizis: a novel cell death
18. inhibition of the development of metastases by dietary vitamin c:k3 combination
19. autoschizis: a new form of cell death for human ovarian carcinoma cells following ascorbate/menadione treatment. nuclear and dna degradation
20. the molecular biology of apoptosis
21. evolutionary conservation of a genetic pathway of programmed cell death
22. interaction between the c. elegans cell-death regulators ced-9 and ced-4
23. interaction and regulation of the caenorhabditis elegans death protease ced-3 by ced-4 and ced-9
24. caspases: enemies within
25. vital functions for lethal caspases
26. mechanism of xiap-mediated inhibition of caspase-9
27. insights into the regulatory mechanism for caspase-8 activation
28. a combinatorial approach defines specificities of members of the caspase family and granzyme b. functional relationships established for key mediators of apoptosis
29. signalling by cd95 and tnf receptors: not only life and death
30. apoptosis control by death and decoy receptors
31. the tnf receptor 1-associated protein tradd signals cell death and nf-kappa b activation
32. fadd, a novel death domain-containing protein, interacts with the death domain of fas and initiates apoptosis
33. caspase-6 is the direct activator of caspase-8 in the cytochrome c-induced apoptosis pathway: absolute requirement for removal of caspase-6 prodomain
34. two cd95 (apo-1/fas) signaling pathways
35. induction of cell death by tumour necrosis factor (tnf) receptor 2, cd40 and cd30: a role for tnf-r1 activation by endogenous membrane-anchored tnf
36. tumor necrosis factor (tnf) receptor 1 signaling downstream of tnf receptor-associated factor 2. nuclear factor kappab (nfkappab)-inducing kinase requirement for activation of activating protein 1 and nfkappab but not of c-jun n-terminal kinase/stress-activated protein kinase
37. involvement of mach, a novel mort1/fadd-interacting protease, in fas/apo-1- and tnf receptor-induced cell death
38. the ced-4-homologous protein flash is involved in fas-mediated activation of caspase-8 during apoptosis
39. viral flice-inhibitory proteins (flips) prevent apoptosis induced by death receptors
40. memory and distribution of virus-specific cytotoxic t lymphocytes (ctls) and ctl precursors after rotavirus infection
41. fas-dependent cd4+ cytotoxic t-cell-mediated pathogenesis during virus infection
42. transcriptional regulation during p21waf1/cip1-induced apoptosis in human ovarian cancer cells
43. activation of c-jun nh2-terminal kinase/stress-activated protein kinase (jnk/sapk) is critical for hypoxia-induced apoptosis of human malignant melanoma
44. characterization of mice deficient in interleukin-1 beta converting enzyme
45. reduced ischemic brain injury in interleukin-1 beta converting enzyme-deficient mice
46. caspase-3 is required for dna fragmentation and morphological changes associated with apoptosis
47. targeted disruption of the mouse caspase 8 gene ablates cell death induction by the tnf receptors, fas/apo1, and dr3 and is lethal prenatally
48. fadd: essential for embryo development and signaling from some, but not all, inducers of apoptosis
49. reduced apoptosis and cytochrome c-mediated caspase activation in mice lacking caspase 9
50. the ins and outs of programmed cell death during c. elegans development
51. c-flip(l) is a dual function regulator for caspase-8 activation and cd95-mediated apoptosis
52. the flip side of flip
53. apaf-1, a human protein homologous to c. elegans ced-4, participates in cytochrome c-dependent activation of caspase-3
54. an apaf-1.cytochrome c multimeric complex is a functional apoptosome that activates procaspase-9
55. apaf1 (ced-4 homolog) regulates programmed cell death in mammalian development
56. autoactivation of procaspase-9 by apaf-1-mediated oligomerization
57. bcl-2 inhibits death of central neural cells induced by multiple agents
58. bcl-2 family proteins
59. role of hypoxia-induced bax translocation and cytochrome c release in reoxygenation injury
60. association of bax and bak homo-oligomers in mitochondria. bax requirement for bak reorganization and cytochrome c release
61. bcl-2 prevents bax oligomerization in the mitochondrial outer membrane
62. mechanisms of caspase activation and inhibition during apoptosis
63. diablo promotes apoptosis by removing miha/xiap from processed caspase 9
64. iap family proteins-suppressors of apoptosis
65. identification of ciap1 as a candidate target gene within an amplicon at 11q22 in esophageal squamous cell carcinomas
66. the apoptosis inhibitor gene api2 and a novel 18q gene, mlt, are recurrently rearranged in the t(11;18)(q21;q21) associated with mucosa-associated lymphoid tissue lymphomas
67. a novel anti-apoptosis gene, survivin, expressed in cancer and lymphoma
68. a serine protease, htra2, is released from the mitochondria and interacts with xiap, inducing cell death
69. omi/htra2 catalytic cleavage of inhibitor of apoptosis (iap) irreversibly inactivates iaps and facilitates caspase activity in apoptosis
70. hsp27 functions as a negative regulator of cytochrome c-dependent activation of procaspase-3
71. heat-shock protein 70 inhibits apoptosis by preventing recruitment of procaspase-9 to the apaf-1 apoptosome
72. negative regulation of cytochrome c-mediated oligomerization of apaf-1 and activation of procaspase-9 by heat shock protein 90
73. distinctive roles of phap proteins and prothymosin-alpha in a death regulatory pathway
74. caspase-dependent cleavage of signaling proteins during apoptosis. a turn-off mechanism for anti-apoptotic signals
75. caspase-mediated proteolysis during apoptosis: insights from apoptotic neutrophils
76. programmed cell death contributes to postnatal lung development
77. granulocyte apoptosis and its role in the resolution and control of lung inflammation
78. apoptosis is a major pathway responsible for the resolution of type ii pneumocytes in acute lung injury
79. mechanisms of structural remodeling in chronic pulmonary hypertension
80. induction of apoptosis and pulmonary fibrosis in mice in response to ligation of fas antigen
81. essential roles of the fas-fas ligand pathway in the development of pulmonary fibrosis
82. granulocyte apoptosis and the control of inflammation
83. macrophage engulfment of apoptotic neutrophils contributes to the resolution of acute pulmonary inflammation in vivo
84. modulation of neutrophil apoptosis by granulocyte colony-stimulating factor and granulocyte/macrophage colony-stimulating factor during the course of acute respiratory distress syndrome
85. g-csf and il-8 but not gm-csf correlate with severity of pulmonary neutrophilia in acute respiratory distress syndrome
86. interleukin-2 involvement in early acute respiratory distress syndrome: relationship with polymorphonuclear neutrophil apoptosis and patient survival
87. resolution of lung inflammation by cd44
88. macrophages that have ingested apoptotic cells in vitro inhibit proinflammatory cytokine production through autocrine/paracrine mechanisms involving tgf-beta, pge2, and paf
89. phosphatidylserine-dependent ingestion of apoptotic cells promotes tgf-beta1 secretion and the resolution of inflammation
90. recombinant human fas ligand induces alveolar epithelial cell apoptosis and lung injury in rabbits
91. fas expression in pulmonary alveolar type ii cells
92. expression of fas (cd95) and fasl (cd95l) in human airway epithelium
93. natural protection from apoptosis by surfactant protein a in type ii pneumocytes
94. upregulation of gelatinases a and b, collagenases 1 and 2, and increased parenchymal cell death
in copd inhibition of vegf receptors causes lung cell apoptosis and emphysema oxidative stress and apoptosis interact and cause emphysema due to vascular endothelial growth factor receptor blockade glucocorticoid-induced apoptosis in human eosinophils: mechanisms of action increased circulating levels of soluble fas ligand are correlated with disease activity in patients with fi brosing lung diseases bleomycin-induced apoptosis of alveolar epithelial cells requires angiotensin synthesis de novo the perforin mediated apoptotic pathway in lung injury and fi brosis interleukin-13 induces tissue fi brosis by selectively stimulating and activating transforming growth factor beta(1) early growth response gene 1-mediated apoptosis is essential for transforming growth factor beta1-induced pulmonary fi brosis united states lung carcinoma incidence trends: declining for most histologic types among males, increasing among females progress in understanding the molecular pathogenesis of human lung cancer loss of expression of death-inducing signaling complex (disc) components in lung cancer cell lines and the infl uence of myc amplifi cation conversion of bcl-2 to a bax-like death effector by caspases differences in expression of pro-caspases in small cell and non-small cell lung carcinoma defective caspase-3 relocalization in non-small cell lung carcinoma increased expression of apaf-1 and procaspase-3 and the functionality of intrinsic apoptosis apparatus in non-small cell lung carcinoma rescue of death receptor and mitochondrial apoptosis signaling in resistant human nsclc in vivo expression of inhibitor of apoptosis proteins in small-and non-small-cell lung carcinoma cells key: cord-262681-2voe4r7f authors: kim, moon-young; cheong, harin; kim, hyung-seok title: proposal of the autopsy guideline for infectious diseases: preparation for the post-covid-19 era (abridged translation) date: 2020-08-14 journal: j korean med sci doi: 10.3346/jkms.2020.35.e310 sha: doc_id: 262681 
cord_uid: 2voe4r7f with the rapidly spreading coronavirus disease 2019 (covid-19) pandemic over the past few months, the world is facing an unprecedented crisis. innumerable lives have been lost to this novel infectious disease, the nature of which supersedes conventional medical understanding. the covid-19 pandemic is not just a global health crisis, several aspects of life in the post-covid-19 era are also being contemplated. experts in unison are warning that the upcoming changes in all areas of life could potentially be far more drastic than ever experienced in the entire human civilization. the medical community is no exception, and therefore, personnel involved in forensic medicine also need to be adequately prepared for the future. forensic medicine is a branch of medicine dedicated to one of the most important stages of the human lifecycle and has always been at the forefront in times of unprecedented social change. the autopsy, one of the most important tools of forensic medicine, is also useful to infectious diseases because it identifies the causal relationship between death and infection, reveals medical and epidemiological knowledge, and provides objective evidence for legal disputes. we present new autopsy guidelines in forensic medicine, formulated based on the various infectious diseases that we presently live with and may encounter in the future. in formulation of these guidelines several considerations have been taken into account, namely, the role forensic pathologists should play in the post-covid-19 era and the necessary preparations as well as the support needed from society to fulfill that role. the present covid-19 outbreak should be a starting point for formulating improvements in current practices in forensic science, including autopsy biosafety practices and the medicolegal death investigation system. 
despite the development of medical science, as the complexity of our society increases, various microorganisms that have the potential to be infectious agents constantly threaten humanity. through accumulated mutation, even well-known microorganisms are becoming new species, resulting in stronger transmission or a higher number of fatalities. korea also needs to manage both the domestic spread and the foreign inflow of various infectious diseases. examples of the former in korea include respiratory-mediated infectious diseases, such as tuberculosis, which is known to be endemic, and blood-mediated infectious diseases, such as hepatitis b, hepatitis c, and acquired immune deficiency syndrome (aids), which need continuous monitoring. 1,2 numerous foreign infectious diseases are newly emerging as a result of changes in climate and biological distribution due to environmental degradation, and the collapse of interspecies barriers. as international exchange increases, they can flow into other countries at any time. 3 recently, several respiratory diseases caused by novel viruses, such as severe acute respiratory syndrome ( in a situation where enormous social and economic losses are caused by the periodic outbreak of novel infectious diseases, the national quarantine system requires improvement to cope with the public health crisis. we believe that autopsy can provide the basic data for establishing appropriate quarantine and preventive measures. the autopsy is the most direct approach to a disease or other medical abnormalities. historically, a wide range of information on pathogenesis, epidemiology, and the natural course of numerous diseases has been collected through autopsy, leading to the development of medicine. also, the autopsy identifies legal problems related to death and prevents potential disputes, the necessity for which has been recognized across many sectors of society. 
it should be considered more important in a death related to an infectious disease. while the clinical environment is prepared for infectious diseases through administrative and financial support, the death investigation system in korea is not comparably prepared. although many infectious diseases are diagnosed only postmortem through autopsy, autopsy personnel are exposed to the risk of infection because of insufficient clinical information, a lack of protective facilities or equipment, and accidental injuries. therefore, a guideline for the standard autopsy for infectious diseases is stated here, which aims to: 1) provide scientific grounds to establish appropriate plans for the prevention and treatment of infectious diseases, 2) contribute to improving national health by controlling the spread of pathogens within the community, and 3) protect human resources engaged in autopsy-related work from the risk of infection. several autopsy guidelines, including more recent ones focusing on covid-19, have been adopted here. [5] [6] [7] [8] [9] most of them suggest that the principles for handling covid-19 during autopsy do not differ from those for handling other infectious diseases. this guideline does not describe current practice, but indicates how we need to operate from now on, which requires continuous effort from those dedicated to forensics as well as support from the related social systems. the pathogens of infectious diseases include various microorganisms, such as bacteria, viruses, fungi, parasites, and even prions. among the various routes of transmission, direct contact with blood or body fluids and aerosol transfer via droplets or droplet nuclei are considered important during the autopsy. patients with active infection may have acute, subacute, or chronic symptoms, which is called clinical disease, or have no apparent symptoms, which is called subclinical or occult disease. 
infection by some agents could be inactive for a certain period, which is called latent infection. the diagnosis of an infectious disease could be considered based on 1) medical history, from the statements of his or her acquaintances or formal medical records; 2) postmortem tests for the detection of microorganisms, such as serologic, genetic, or culture tests using blood, secretion, fluid, or tissue; 3) pathologic findings, using conventional and special stains; and 4) epidemiologic information about the deceased or his or her close contacts, such as the location of residence and workplace, occupation, travel history, and recent whereabouts. the infectious disease control and prevention act of korea designated some infectious diseases with epidemiologic importance as 'legal infectious diseases.' these diseases were classified into four classes according to their severity, infectiousness, and isolation level (appendix 1). an emerging infectious disease with the possibility of severe symptoms or rapid transmission is considered an 'emerging infectious disease syndrome' in class 1. covid-19, caused by sars-cov-2, is an example of this temporary classification, which should be classified properly after the pathogenesis and clinical features are further revealed. according to the act, a doctor who identified an infectious disease from a living patient or a dead body should report to the regional public health center. the director of the korea centers for disease control and prevention (kcdc) may order an autopsy of the deceased who is suspected of having died from an infectious disease, to confirm the final diagnosis. the autopsy process should be conducted by a specialist in infectious disease, human anatomy, pathology, or forensic medicine, in a facility with an adequate level of biosafety. 
in 2016, the kcdc suggested a revised classification of the risk groups of infectious agents (appendix 2), which is based on the classification for the biology laboratory published by the who in 2004. 10 according to this classification, risk group 2 includes the pathogens that are unlikely to be a serious hazard, such as hbv and hcv, while risk group 3 includes the pathogens that usually cause serious diseases, such as mycobacterium tuberculosis, sars-cov, and hiv. for both groups, effective treatment and preventive measures are available in general. the infectious disease control and prevention act of korea classifies the safety control measures of the facilities handling high-risk pathogens into four grades (appendix 3), which correspond to the biosafety levels (bsls or bls) suggested by the who. they could be applied to all the pathogens identified so far. registration with the kcdc is required for handling high-risk pathogens of grade 1 and 2, while permission from the kcdc is required for those of grade 3 or 4. the classification is as follows:
• grade 1: facilities that handle high-risk pathogens that are unlikely to cause diseases in healthy adults.
• grade 2: facilities that handle high-risk pathogens that can cause human diseases unlikely to be a serious hazard and for which effective treatment and preventive measures are available.
• grade 3: facilities that handle high-risk pathogens that usually cause serious human diseases and for which effective treatment and preventive measures are available.
• grade 4: facilities that handle high-risk pathogens that usually cause serious human diseases and for which effective treatment and preventive measures are not usually available. 
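the four grades above can be read as a small decision rule. the following is a minimal sketch of that reading, not an implementation of the act; the class and function names are hypothetical, and the case of a non-serious pathogen without effective countermeasures is not covered by the text, so the sketch simply falls back to grade 2 for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathogenProfile:
    """Hypothetical summary of a pathogen; field names are illustrative only."""
    causes_disease_in_healthy_adults: bool
    usually_serious: bool
    countermeasures_available: bool  # effective treatment and preventive measures


def facility_grade(p: PathogenProfile) -> int:
    """Map a pathogen profile to the facility grade (1-4) described in the text."""
    if not p.causes_disease_in_healthy_adults:
        return 1
    if not p.usually_serious:
        # Assumption: the text defines grade 2 only when countermeasures exist;
        # the uncovered combination is also mapped to grade 2 here.
        return 2
    return 3 if p.countermeasures_available else 4


def kcdc_requirement(grade: int) -> str:
    """Grades 1 and 2 need registration with the KCDC; grades 3 and 4 need permission."""
    if grade not in (1, 2, 3, 4):
        raise ValueError("grade must be 1, 2, 3, or 4")
    return "registration" if grade <= 2 else "permission"


if __name__ == "__main__":
    serious_treatable = PathogenProfile(True, True, True)
    grade = facility_grade(serious_treatable)
    print(grade, kcdc_requirement(grade))
```

the two functions are kept separate because the grade describes the facility, while the registration or permission requirement is an administrative consequence of that grade.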
in addition, the same act designates some infectious agents as 'high-risk pathogens' that require special attention from the nation and society, because of the potential for serious risk to public health if they are used for biological terrorism or released by accident (appendix 4). some agents in risk groups 2, 3, and 4 recommended by the who and some causative agents of recent outbreaks, such as sars-cov and mers-cov, are included in this list. it is anticipated that sars-cov-2 will be added in the near future by revision of the act. autopsy plays a critical role in 1) determining the situation and specific causes of death, 2) excluding other causes of death when a patient dies during treatment or isolation for a confirmed infection, and 3) evaluating the medical relationships between infection and death if the infection is not a direct cause of death. autopsy is able to provide crucial information for 1) the establishment of an appropriate treatment plan based on the pathological mechanisms by confirming the clinical course, symptoms, histology, and prognosis, and 2) scientific evidence to control and prevent the spread of pathogens within the community, by identifying the path of transmission and the prevalence in the target population. because the possibility of legal disputes is often unclear immediately after a death, and the bereaved family is often confused, the decision to conduct an autopsy should be made after careful consideration of the circumstances surrounding the death. potential legal disputes may be related to the validity and relevance of medical treatment or administrative actions, compensation claims against industrial accident insurance or commercial medical insurance, or professional negligence of a business owner. 
most of the situations are already covered by the criteria for the decision of unnatural deaths suggested by the kslm (appendix 5), or the instructions for handling unnatural deaths declared by the korean national police agency (appendix 6). most of the medical history provided before the autopsy by the police is limited to the statements of bereaved families or acquaintances, or concise data from the national health insurance corporation. obtaining the medical records of the deceased requires additional effort from the police or the bereaved families. however, the medical information is essential for identifying the health status of the deceased and for preparing against the potential risk of infection. the incidence of tuberculosis among autopsy workers is known to be 100-200 times that of the general population, 11 although it has never been investigated in korea. a small-group survey in korea indicated that the prevalence of tuberculosis and hepatitis b among medical workers was suspected to be very high. 12 because hepatitis c and aids are difficult to cure and have a poor prognosis, serologic tests are performed on all surgical patients to protect medical personnel. however, in korea, no particular tests are currently required as routine for a dead body before the autopsy. the purpose of the medico-legal autopsy may be divided into a judicial one, to confirm criminal relevance, and an administrative one, to manage public issues related to infectious diseases, accidents, or disasters, while that of the clinical autopsy is usually focused on medical evaluation. in korea, the legal basis for all forms of autopsy is in place. for example, in the cases of infectious diseases that are not expected to be related to crime, the autopsy may be conducted by the minister of health and welfare, the mayor, the governor, the director of the kcdc, or the head of the quarantine office. 
but in practice, the autopsy is always requested by the police, which inevitably limits its purpose. although autopsy rooms are installed at the national forensic service (nfs) and its local branches, and at some medical schools that have forensic or pathology departments, their bsls differ. for example, the headquarters of the nfs has a special autopsy room of bsl3, while some medical schools have only bsl1 rooms. in principle, if the deceased is known to be a tuberculosis patient, the autopsy should be conducted in the bsl3 autopsy room, because mycobacterium tuberculosis belongs to risk group 3 with sars-cov and hiv. 10 but this principle is hard to follow, due to the high prevalence of tuberculosis in korea, and a lack of medical history, as mentioned above. each institution is in charge of the management of personal protective equipment (ppe) required for the autopsy, without sharing a standardized protocol. to assess the risk of infection caused by autopsy, the medical conditions of autopsy personnel should be checked periodically, especially after the autopsy of a high-risk person. 5 throughout the branches of the nfs and the universities, there are no principles for the list of target pathogens, the method and frequency of surveillance tests, or the criteria for subjects who need such monitoring. conducting an autopsy involves the following sequence of procedures: 1) transferring the body from the funeral home, 2) receiving the body at the autopsy room, 3) checking the identity of the body with the police or bereaved family, 4) performing the autopsy, 5) returning the body to the funeral home, and 6) transferring the samples for postmortem tests to other departments. the workers who are involved before and after the autopsy should be guided and trained in the use of ppe and hygiene control, because there is a possibility of contagion from the deceased during the wrapping and transporting processes. 
laboratory personnel dealing with samples taken from the body during the autopsy should be aware of the potential risk of infection in all autopsy samples, and receive the same level of health support as the autopsy personnel. in particular, all laboratories dealing with the initial sample that has not been chemically treated or biologically inactivated must have bsl2 or higher level facilities and appropriate ppe. 10 throughout the autopsy-related facilities in korea, there is only a low level of safety consideration for personnel who are not directly involved in the autopsy, and that level is insufficient to deal with a body or samples infected by a high-risk pathogen. all bodies should be considered to be infected by unspecified microorganisms, until they are diagnosed as negative by a medically verified examination, using appropriate samples. the autopsy personnel have the right to be protected from infection by the body, for which the affiliated agencies should make appropriate efforts. even if an infectious disease is newly diagnosed after the autopsy, the risk of infection to the autopsy personnel should be kept low. a 'standard autopsy' for infectious diseases is defined as an autopsy conducted by an agreed procedure for this purpose, which should always be observed, regardless of the prevalence of the infectious disease. for confirmed cases, some conditions could be added for the optimal protection of the autopsy personnel. in contrast, if any of the facilities, personnel, equipment, or procedures does not meet the standard, the autopsy shall be considered an 'ordinary autopsy'. the risk of transmission during an autopsy could be assessed according to the infection status of the body (table 1). to conduct an autopsy for a confirmed case, the biosafety levels of the facilities for autopsy and laboratory tests should be equivalent to or higher than that of the pathogen. 
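the facility requirement for confirmed cases can be expressed as a simple comparison. the sketch below is illustrative only: the lookup table contains just the example pathogens that this guideline assigns to risk groups 2 and 3, and it assumes that the risk group number can be compared directly with the bsl number of a facility, which the text implies but does not state as a formula. the function and table names are hypothetical.

```python
# Example pathogens and risk groups named in this guideline (not a complete list).
RISK_GROUP = {
    "hbv": 2,
    "hcv": 2,
    "mycobacterium tuberculosis": 3,
    "sars-cov": 3,
    "hiv": 3,
}


def facilities_adequate(pathogen: str, autopsy_room_bsl: int, laboratory_bsl: int) -> bool:
    """Return True if both facilities are at or above the pathogen's level.

    Assumption: the required BSL equals the pathogen's risk group number.
    Raises KeyError for pathogens that are not in the example table.
    """
    required = RISK_GROUP[pathogen.strip().lower()]
    return autopsy_room_bsl >= required and laboratory_bsl >= required


if __name__ == "__main__":
    # A BSL2 autopsy room is not sufficient for a confirmed risk group 3 pathogen.
    print(facilities_adequate("sars-cov", autopsy_room_bsl=2, laboratory_bsl=3))
    print(facilities_adequate("hbv", autopsy_room_bsl=2, laboratory_bsl=2))
```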
during the prevalence of a certain infectious disease, all the unknown cases should be regarded at least as suspected cases. however, considering the realistic restrictions, if there were reasonable compensations, such as preliminary tests before the autopsy, adequate ventilation and disinfection of the facilities, or additional use of ppes, the autopsy could be conducted by substandard protocols. even for the negative cases that are allowed for the ordinary autopsy, a higher level of protection is recommended, because there is always the possibility of a false-negative. considering the prevalence and biological risks, a list of infectious pathogens should be selected, and periodically evaluated for surveillance. preliminary tests for these pathogens are recommended. during the prevalence of high-risk pathogens (appendix 4) or their equivalents, preliminary tests are mandatory for clinically suspected cases to determine the conduct and the coverage of the autopsy. 4 the autopsy can be postponed until the results of the preliminary tests are available. even for the cases in which the preliminary test was negative, if suspicious findings were found during the autopsy, it is recommended to repeat the test with the autopsy samples. the possibility of false-negatives should always be considered, because the results could be affected by the infection period, sampling methods, status of the samples or bodies, or the characteristics of the test itself. place the body in a leak-proof transparent plastic bag with a thickness of 150 μm, and seal it. do not use pins or clips that can damage the sealing conditions. 13 put the plastic bag into another opaque body bag, and wipe its outer surface with sodium hypochlorite diluted 1:4 (e.g., 5% sodium hypochlorite 100 ml + water 400 ml mix), and dry. attach an identification tag to both the body and its bag, respectively, and make sure that they are not lost. refrigerate the body at 4°c. 
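the dilution ratios used in this guideline (1:4, 1:49, and 1:99) follow the same arithmetic: one part of stock solution is mixed with n parts of water. the following is a minimal sketch of that calculation; the function names are hypothetical, and the final concentration is a derived value, assuming the stock is the 5% sodium hypochlorite used in the examples.

```python
from typing import Tuple


def dilution_volumes(total_ml: float, parts_water: int) -> Tuple[float, float]:
    """Return (stock_ml, water_ml) for a 1:parts_water dilution of a given total volume."""
    if total_ml <= 0 or parts_water < 0:
        raise ValueError("total_ml must be positive and parts_water non-negative")
    stock_ml = total_ml / (1 + parts_water)
    return stock_ml, total_ml - stock_ml


def final_concentration(stock_percent: float, parts_water: int) -> float:
    """Concentration (%) after diluting the stock 1:parts_water."""
    return stock_percent / (1 + parts_water)


if __name__ == "__main__":
    for parts in (4, 49, 99):
        stock, water = dilution_volumes(500, parts)
        conc = final_concentration(5.0, parts)
        print(f"1:{parts} -> stock {stock:g} ml + water {water:g} ml, about {conc:g}%")
```

for a 500 ml batch this reproduces the worked examples in the text: 100 ml + 400 ml for 1:4, 10 ml + 490 ml for 1:49, and 5 ml + 495 ml for 1:99.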
at the beginning of the autopsy, disinfect the outer and inner surfaces of the body bag and the skin of the body with 70% alcohol or sodium hypochlorite diluted 1:99 (e.g., 5% sodium hypochlorite 5 ml + water 495 ml mix). the biosafety standard of the autopsy-related facilities may correspond to the bsl in general, although slight modification is required to reflect the procedures and equipment of the autopsy. the concept of bsl is also adopted in the 'standards for the installation and operation of facilities handling high-risk pathogens (ministry of health and welfare notice no. 2019-59)' (appendix 3), which korean institutes should follow when handling microorganisms with potential biologic risk. a bsl2 autopsy room is required for the ordinary autopsy, while a bsl3 or higher level is required for the standard autopsy, according to the risk group of the confirmed or suspected pathogen. 10 in an autopsy room that does not meet the above criteria, at least 1) the air inside the autopsy room should not escape to other spaces in the building, 2) the exhaust route should avoid other intake vents and public spaces, and 3) additional devices or ppes should be used to compensate for the unmet requirements. considering the environment of the autopsy room and the prevalence at the time, preliminary testing of all requested bodies for certain pathogens should be considered, and the results should inform the decision on whether to conduct the autopsy and on its coverage. waste generated in all processes related to the body is classified as medical waste. it should be immediately disposed of in a dedicated envelope or containerboard box. in particular, sharp tools, such as injection needles, suture needles, or scalpels, should be discarded in a dedicated plastic container. waste is sealed, disinfected, and then refrigerated in a dedicated warehouse. it should be transported to a medical waste incinerator within 7 days, and disposed of within 2 days. 
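the two time limits for medical waste can be tracked with simple date arithmetic. the sketch below is only an illustration: the text does not say from which event each limit is counted, so it assumes that the 7 days run from the date the waste is sealed and the 2 days run from the date it is transported to the incinerator. the function and key names are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, Optional

TRANSPORT_LIMIT_DAYS = 7  # transport to the incinerator (assumed to count from sealing)
DISPOSAL_LIMIT_DAYS = 2   # disposal (assumed to count from transport)


def waste_deadlines(sealed_on: date, transported_on: Optional[date] = None) -> Dict[str, Optional[date]]:
    """Return the latest transport date and, once transported, the latest disposal date."""
    deadlines: Dict[str, Optional[date]] = {
        "transport_by": sealed_on + timedelta(days=TRANSPORT_LIMIT_DAYS),
        "dispose_by": None,
    }
    if transported_on is not None:
        if transported_on < sealed_on:
            raise ValueError("transport cannot precede sealing")
        deadlines["dispose_by"] = transported_on + timedelta(days=DISPOSAL_LIMIT_DAYS)
    return deadlines


if __name__ == "__main__":
    print(waste_deadlines(date(2020, 8, 1)))
    print(waste_deadlines(date(2020, 8, 1), transported_on=date(2020, 8, 5)))
```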
if a surface is contaminated, wipe it with sodium hypochlorite diluted 1:49 (e.g., 5% sodium hypochlorite 10 ml + water 490 ml mix), and leave it for 15-30 minutes, before wiping it again with water. if a metal surface is to be disinfected, wipe it with 70% alcohol (e.g. 100% alcohol 70 ml + water 30 ml mix). if a surface is visibly contaminated by blood and body fluids, wipe it with sodium hypochlorite diluted 1:4 (e.g., 5% sodium hypochlorite 100 ml + water 400 ml mix), and leave it for 10 minutes, before wiping it again with water. the sodium hypochlorite solution should be newly mixed each time. after disinfection is finished, thorough ventilation is required. reusable surgical garments (e.g., gown, mask) made of cotton could be included in the alternative list of ppes. the cotton contaminated with blood or body fluids should be washed with hot water at 70°c or higher. if unavailable, soak them in sodium hypochlorite diluted 1:49 (e.g., 5% sodium hypochlorite 100 ml + water 4,900 ml mix), and leave them for 30 minutes before washing. handle as gently as possible, to avoid aerosols. aiming to protect the whole body of the autopsy personnel, including respiratory tract, eyes, and hands, from the infection, ppe should be selected in consideration of the nature and the infection route of the pathogen, and the expected situation of possible exposure. ppe should in principle be disposable or single-use, but some items (e.g., powered air-purifying respirators (papr), goggles, face shields, surgical garments made of cotton, and boots or shoes) may be designed for reuse, which should be disinfected or sterilized according to the manufacturer's instructions. keep hair from flowing down, and remove personal accessories, like watches, in advance. to prevent unnecessary contamination, each manual of the ppes, including dressing and undressing orders, should be understood in advance, and properly applied. 
once the ppes are used, they shall be regarded as contaminated and discarded or disinfected. hand hygiene shall be carried out before and after dressing or undressing. damaged or contaminated ppes should be discarded, without being reused or stored again. it is recommended that the autopsy personnel cross-check one another to confirm that the ppes are worn properly. the dressing order of ppe should be as follows: the undressing order of ppe should be as follows: it is recommended to disinfect inner gloves at each step, as they may become contaminated during the undressing. if ppes are found to be damaged, this should be considered as exposure to the pathogen and followed by proper management of the personnel. all the processes should be supervised by an experienced forensic pathologist. the number of people who participate in the autopsy should be minimized. however, it is recommended that at least two people be present in the autopsy room, in case of an emergency. 8 to prevent cutting injuries, the dissection of each body part should be conducted by only one person at a time. a person who is not directly participating in the autopsy, such as a bereaved family member or the police, is restricted from entering the autopsy room. during the prevalence of certain infectious diseases or the autopsy of confirmed cases, the access of trainees, such as medical students or residents, is also restricted. if necessary, observation through a window or a monitor is recommended, in a completely separate space from the autopsy room. all the autopsy personnel should be cautious with sharp objects, such as scalpels, knives, needles, or bone sections, which can cause cutting injuries. damaged or contaminated ppes should be immediately discarded, and replaced with new ones. in the case of exposure to an infection source, disinfect the exposed area immediately in a proper way; and if there is medical evidence, start prophylactic treatment. 
a if a full-body suit is not available, surgical cap and long boots can be used to minimize exposed parts. also, if the suit or gown is not made of waterproof material, the waterproof function of the ppes can be supplemented with a plastic apron or arm covers; b although they are not truly cut-proof, work gloves made of cotton may interrupt the movement of blades.
if the body is suspected to have an airborne disease, the following should be performed with special caution, to prevent aerosols: 1) cutting bone with electronic saws, for which replacement by manual saws or additional use of vacuum inhalers is recommended, 2) opening the containers or centrifugation of samples, 3) body movement during transportation or postural adjustment during autopsy, which may cause spouting of oral and nasal contents, 4) incision of the bronchus or lung parenchyma, which may expose the secretions inside, and 5) washing the body with a showerhead, which may spray its body fluids or adhering material. for the suspected cases or the unknown/negative cases with suspicious findings in the autopsy, a medically verified test at the time for each pathogen or disease should be requested, with appropriate samples. in these cases, the autopsy personnel and facilities should be managed as if they participated in the autopsy for confirmed cases, until the test result is assured to be negative. the initial sample, which is neither chemically treated nor biologically inactivated, should be handled within the biosafety cabinet installed in the bsl2 laboratory by experienced personnel wearing ppes equivalent to those used in the autopsy room. meanwhile, after chemical treatment or biologic inactivation, the samples can be handled on an ordinary bench. purified dna or protein can be handled in the bsl1 laboratory, but the use of a biosafety cabinet or its equivalent is recommended. 
to transport the autopsy samples, they should be prepared in the following order: 1) put the samples into the primary container, and seal it, 2) disinfect the outer surface of the primary container with 70% alcohol, and label it with an identification tag, 3) wrap the primary container with an absorbent (e.g., paper towel), 4) put the primary container into the secondary container, and seal it, 5) put the secondary container into the tertiary container and seal it, and then label it with a tag. the personnel who pack or open the containers should wear the ppes equivalent to that used in the autopsy room or laboratory. any work that requires contact with the containers, for example, simple transportation in sealed status, requires at least the wearing of gloves. the affiliated agency should recognize the major infection history of all personnel who participate in autopsy or handle postmortem samples, and take necessary measures to prevent infections. if the standard autopsy was conducted without any damage of ppes, the risk of infection is generally low. however, if the biologic nature of the pathogen or the epidemiology and pathophysiology of the disease are not fully identified, all the participants should be alert during the expected incubation period, even though they are not obviously exposed, with self-monitoring of the symptoms and the minimizing of face-to-face contacts. all the personnel who have accessed the autopsy room should be recorded: not only the direct participants in the autopsy, but also assistants for the maintenance of the facilities. considering the prevalence and biological risk, a list of infectious pathogens should be selected, and periodically evaluated for surveillance. if there is clinical evidence, prophylaxis, like vaccination, is recommended. 
in particular, each participant in the autopsy of confirmed cases should check whether he or she is already infected with the pathogen, so that if he or she is infected during the autopsy, the infection source can be traced. for personnel who participated in the autopsy of a body confirmed to be infected, if the standard protocol was followed, there is no possibility of exposure, so only self-monitoring of the symptoms and the minimizing of face-to-face contacts during the expected incubation period are required. however, if the autopsy procedures failed to meet the standard protocol, or the ppes were damaged, infection should be suspected. in this case, self-isolation during the expected incubation period, and if available, prophylaxis, is required. the relevant personnel should be tested for the pathogen at the time point when related symptoms are shown, or when the isolation period is nearly ended. if a test was requested after the autopsy, but the results are pending, the same actions are required in the interim. if the autopsy procedures failed to meet the standard protocol, or the ppes were damaged, the forensic pathologist in charge of the autopsy may consider adjusting the participant members, or discontinuing and delaying the autopsy schedule, to protect the autopsy personnel. if a test for a certain infection was requested after the autopsy, the process and the result should be shared with all the personnel who had, or would have, contact with the body, including the police, the bereaved family, the person who has discovered, reported, inspected, or transported the body, and the funeral staff. they are required to minimize face-to-face contacts, until the test result is confirmed. covid-19 is a respiratory syndrome caused by the infection of sars-cov-2, which belongs to the coronavirus family.
currently, in korea, covid-19 is regarded as an 'emerging infectious disease syndrome,' which is included in class 1 legal infectious disease (appendix 1), and sars-cov-2 is considered a high-risk pathogen, which needs 'urgent management' (appendix 4) . it is known to be transmitted through aerosols, droplets, or direct contact, while the viruses have also been found in tears and feces. 14, 15 the incubation period is up to 14 days, and symptoms were expressed within 12.5 days after exposure in 95% of the infected. according to the studies published so far, the survival period of sars-cov-2 is 3 hours in aerosols, 4 hours on copper surfaces, 24 hours on cardboard surfaces, and 2-3 days on plastic or iron surfaces, which indicate that sars-cov-2 can survive for a considerable period outside of the host. 16 2) clinical and pathologic findings sars-cov-2 patients show diverse symptoms, ranging from asymptomatic to severe respiratory failure. major symptoms are fever, fatigue, dry cough, muscle ache, and shortness of breath; and a few cases included sputum, headache, hemoptysis, and diarrhea. recently, the cdc of the united states and the kcdc added ageusia and anosmia as major symptoms of covid-19. the patients are frequently diagnosed with viral pneumonia, regardless of the actual severity of their symptoms. as of yet, there is no specific therapeutic agent or vaccine. severe patients suffer from respiratory failure, septic shock, and multiple organ failure. the median time to respiratory failure was 8.0 days from symptom onset, while that to mechanical ventilation was 10.5 days. according to a few reports of autopsy or histopathology test, microscopic findings included diffuse alveolar damage, fibromucinous exudates, inflammatory infiltration in the interstitium or intra-alveolar area, viral cytopathic-like change, and thrombogenic vasculopathy. 
17-21 a diagnostic test for sars-cov-2 could be considered based on 1) medical history or symptoms, which are mainly fever or respiratory symptoms, and also include headache, abdominal pain, and fatigue, 2) epidemiologic connection, such as temporal, spatial, or geographical relationships with an epidemic region or confirmed patient, and 3) gross pathologic findings of the lungs, such as consolidation, thick exudates, excessive mucus, or other findings suggestive of acute or severe pneumonia, regardless of the clinical symptoms. the autopsy of confirmed and suspected cases should be conducted at bsl3 or equivalent facilities. for unknown cases, the autopsy could be conducted under bsl3 facilities, but there should be reasonable compensations, such as preliminary tests before the autopsy, adequate ventilation and disinfection of the facilities, or additional use of ppes. the management of autopsy related facilities follows the standard autopsy protocol as mentioned above. even though the generation of droplets or aerosols from bodies is unlikely, it is recommended to minimize direct contact with the bodies or postmortem samples, and prevent damage of ppes. 22 for pathologic study, the respiratory system, including proximal and distal trachea, pulmonary hilum, main and segmental bronchi, pulmonary parenchyma, and other organs, such as the heart, the liver, kidney, spleen, and intestines, could be sampled, according to the purpose of the study. fix them with 10% formalin for 2-3 days. 25 (1) the autopsy for confirmed cases the management of facilities, environment, and human resources follows the standard autopsy protocol suggested above. the initial samples should be sent to the bsl3 laboratory. if there is a risk of infection due to damage of ppes, skin exposure, aerosol-prone manipulation, or cutting injuries, the relevant autopsy personnel and his or her contacts should be provided with proper medical treatment, including disinfection and virus test. 
also, for 2 weeks from the exposed time point, which is the expected incubation period of covid-19, they should be isolated and excluded from the work, even though the initial test result is negative. the affiliated agency should monitor his or her symptoms. (2) the autopsy for suspected or unknown cases if a virus test for the body is carried out during the autopsy, the participating autopsy personnel should minimize face-to-face contact, until the results are reported. if the result of the virus test is positive, a postmortem test, such as toxicology (except for alcohol), biochemical, or genetic test should be requested, after the disinfection of the samples by mixing with 100% alcohol at a ratio of 3 (sample):7 (alcohol). 24 then the samples should be transported according to the standard protocol above. the management of the facilities, environments, and human resources generally follows the standard protocol above, while the disinfection process of the autopsy room and related facilities should refer to the kcdc guideline. if an infection is suspected, for example, due to a substandard autopsy procedure or damaged ppes, the autopsy personnel and their contacts should be provided with proper medical treatment, including disinfection and virus test, with isolation and monitoring for 2 weeks, as mentioned above in section 6)(1). when the body is confirmed to be negative for the virus test, the isolation and monitoring could be discontinued. since the autopsy personnel are under constant risk of infection, there should be consistent effort for the implementation of the standard autopsy guidelines. first of all, expecting the periodic spread of infectious diseases in the future, the preparation of adequate level of ppes and bsl of the autopsy related facilities, and the establishment of a health monitoring and surveillance system are required.
to compensate for the problems of the current death investigation system, which is focused on the judicial purpose, in the short term, the range of unnatural death considered as the subject of judicial autopsy should be expanded as wide as possible, in consultation with the police and the prosecution. in the long term, the autopsy request ordered by the directors of the ministry of health and welfare or the kcdc should be encouraged with systemic supports. also, the ministry of health and welfare or the kcdc should be in charge of the management of biosafety requirements in the autopsy facilities and the arrangement of the qualified human resources and financial support, so that this guideline could be satisfactorily implemented. forensic medicine has developed and gradually improved over a long period, despite all the difficulties such as unfavorable environment and systemic constraints. however, in the upcoming post-covid-19 era, there should be more integrated and organized provision, especially against the risk of infectious diseases. health authorities and forensic pathologists should work together to improve the autopsy environment and the death investigation system, so that a better national health system can be established in the near future. trends in infectious disease mortality the korean society of infectious diseases. guidelines for potential emerging infectious diseases in korea. seoul: the korean society of infectious diseases republic of korea biosafety considerations for autopsy interim guidance for collection and submission of postmortem specimens from deceased persons under investigation (pui) for covid-19 briefing on covid-19: autopsy practice relating to possible cases of covid-19 guide to forensic pathology practice for death cases related to coronavirus disease 2019 (covid-19) (trial draft) central disaster management headquarters and central disease control headquarters. corona virus infection-19 response guideline. 7-4th ed. 
cheongju: central disaster management headquarters and central disease control headquarters mycobacterium tuberculosis at autopsy--exposure and protection: an old adversary revisited occupational infections of health care personnel in korea review article: gastrointestinal features in covid-19 and the possibility of faecal transmission evaluation of coronavirus in tears and conjunctival secretions of patients with sars-cov-2 infection aerosol and surface stability of sars-cov-2 as compared with sars-cov-1 pathological findings of covid-19 associated with acute respiratory distress syndrome covid-19 autopsies withdrawn: mortality of a pregnant patient diagnosed with covid-19: a case report with clinical, radiological, and histopathological findings complement associated microvascular injury and thrombosis in the pathogenesis of severe covid-19 infection: a report of five cases death in the course of judicial execution, such as arrest, interrogation, detention, etc death at the accommodation for health, welfare, and nursing, etc 1. the term "unnatural death" means a death falling under any of the following whose cause is unclear:a. death suspicious, or confirmed to be related to a crime b. accidental death due to natural disaster, traffic accident, safety accident, industrial accident, fire, drowning, etc. c. suicide, or death suspected as suicide d. death in the course of judicial execution, such as arrest, interrogation, detention, etc. e. death at the accommodation for health, welfare, and nursing f. death suspected as acute poisoning by drug, pesticide, alcohol, gas, etc. g. other death with unknown cause 2. the term "unnatural death case" means a case in which one or more bodies that correspond to, or are suspected of unnatural death, are found. 1. the director of an unnatural death case shall apply for a warrant for an autopsy in any of the following cases (referred to as a "priority control case"), unless there are special circumstances: a. 
death suspected to be by murder b. unidentified body, despite the investigation of belongings, fingerprints, etc. at the scene c. death that is expected to draw social attention, such as collective death, child abuse, etc. d. severely decomposed body, so hard to identify injuries or cause of death 2. the director of an unnatural death case shall consider a warrant for an autopsy in any of the following cases (referred to as a "autopsy-considered case"), to confirm the relation to a crime: a. unexpected death of infant or child b. death in the course of judicial execution, such as arrest, interrogation, detention, etc. c. death suspected as acute poisoning by drug, pesticide, alcohol, gas, etc. d. death suspected to be drowning or falling, for which eyewitness or cctv footage is unavailable e. body that is carbonized or skeletonized f. death for which the bereaved family harbors suspicions about the cause g. death by traffic accident suspicious for the relationship to other crime h. death of a person with excessive death benefit, compared to his or her property i. death with disagreement about the cause between the inspection doctor, the investigators, or the director of the case j. other death for which autopsy is required to confirm its cause or circumstance key: cord-259557-n46fbzae authors: richmond, peter; roehner, bertrand m. title: coupling between death spikes and birth troughs. part 1: evidence date: 2018-09-15 journal: physica a doi: 10.1016/j.physa.2018.04.044 sha: doc_id: 259557 cord_uid: n46fbzae in the wake of the influenza pandemic of 1889–1890 jacques bertillon, a pioneer of medical statistics, noticed that after the massive death spike there was a dip in birth numbers around 9 months later which was significantly larger than that which could be explained by the population change as a result of excess deaths. in addition it can be noticed that this dip was followed by a birth rebound a few months later. 
however having made this observation, bertillon did not explore it further. since that time the phenomenon was not revisited in spite of the fact that in the meanwhile there have been several new cases of massive death spikes. the aim here is to analyze these new cases to get a better understanding of this death–birth coupling phenomenon. the largest death spikes occurred in the wake of more recent influenza pandemics in 1918 and 1920, others were triggered by the 1923 earthquakes in tokyo and the twin tower attack on september 11, 2001. we shall see that the first of these events indeed produced an extra dip in births whereas the 9/11 event did not. this disparity highlights the pivotal role of collateral sufferers. in the last section it is shown how the present coupling leads to predictions; it can explain in a unified way effects which so far have been studied separately, as for instance the impact on birth rates of heat waves. thus, it appears that behind the apparent randomness of birth rate fluctuations there are in fact hidden explanatory factors. this paper is about a remarkable case of birth and death fluctuations which, apart from its own interest, may give new insight in the more general problem of vital rate fluctuations. we begin with the following observation. • in the early 20th century european populations were increasing at an annual rate of around 1%. under the assumption of a constant λ birth numbers will see the same annual change. thus, for monthly changes the rate will be 12 times smaller, i.e. around 0.1%. • in contrast, the monthly birth rate fluctuations of actual observed monthly birth numbers were about 4%. 2 this is 40 times larger than the fluctuations under constant birth rate. however remarkable, the case considered in this paper is possibly just one of several similar processes leading to predictable birth rate changes. 
the topic of short-term fluctuations has so far attracted much less attention than the study of medium-or long-term changes. for instance the demographic transition in developed countries was brought about by changes of vital rates over a time interval of several decades. the present study shares several important features with the research field that is concerned with the fluctuations of sex ratio at birth, see [1, 2] . • both researches rely crucially on comparative analysis, either in space across different countries or in time over past centuries. • both investigations focus on short-term effects, for instance the changes that occur in the months following an epidemic or a war, see [3] and [4] . • the respective key-variables, monthly birth rates on one hand and sex ratio fluctuations on the other, can be used as markers, in other words as measurement devices which give insight into abnormal situations. for instance, james [5] documents how offspring sex ratios can reveal endocrine disruptions; similarly mortality sex ratios can be used to explore anomalies of the immune system, see [6] and [7] . in any country the time series of monthly births display substantial fluctuations. there is usually a seasonal pattern which is country dependent; in addition for the same month (say october) in different years there are annual fluctuations of about the same magnitude. it is customary to say that these are random fluctuations but are they really random? in this respect one can observe that by saying that a phenomenon is random one gives up ipso facto all attempts to understand it. it is probably for this kind of reason that albert einstein supported the ''hidden variable'' interpretation of quantum physics. in 1935, i.e. some 10 years after quantum mechanics was introduced, einstein et al. [8] suggested that the wave function description of quantum objects was incomplete in the sense that it did not include some hidden parameters. 
in 1892 jacques bertillon, a pioneer of medical statistics and one of the designers of the ''international classification of diseases'', published an analysis of the influenza pandemic of november 1889-february 1890 in which he showed that approximately 9 months after the climax of the epidemic a temporary birth rate trough (of an amplitude of about 20%) was observed in all countries where the pandemic had a substantial impact, particularly austria, france, germany and italy. while writing his paper bertillon did not know that some 28 years later there would be a massive influenza pandemic through which his discovery could be tested. we will show below that it was indeed confirmed in all countries where the pandemic had an impact. apparently, there have been no further studies of this phenomenon since. actually, even in bertillon's paper the effect is discussed fairly briefly. in particular its mechanism remains to be uncovered. this is the purpose of the present paper.

fig. 1a shows that the population changes based on the summation of monthly births and deaths approximate correctly the observed annual population changes. the curve of birth numbers based on a constant birth rate is naturally parallel to the population curve; it is displayed in the inset graph of fig. 1b; the variation interval of birth numbers (as shown on the vertical axis of the inset) is (9.20, 9.45). these variations are much smaller than the actual fluctuations of birth numbers; as a result the broken line of predicted birth numbers looks almost completely flat when displayed on the same graph as the actual birth fluctuations. source: bunle [11].

while in 1889-1890 and 1918-1919 the effect can be detected very clearly, is it not natural to assume that it exists also in less spectacular cases? this raises the following question. in most countries the death rate presents a winter peak in january or february.
according to the bertillon coupling one would expect a trough of births 9 months later that is to say in october or november. is this indeed the case? if not, in what respect do exceptional death surges differ from regular winter surges? this point will be discussed in the second one of the two papers devoted to this question. 3 in an attempt to get a clearer idea of the mechanism of the coupling effect the present paper will address the following questions. (1) first, we will review and expand the analysis presented in [10] . the question which comes immediately to mind is whether the same effect can be observed in other pandemics. (2) if the answer to the previous question is affirmative it raises another interrogation, namely is this effect restricted to pandemics or does it also exist for other large-scale mortality shocks, for instance famines, earthquakes and so on. wars should not be included in our study for in this case the separation between husbands and wives interferes with the coupling that we wish to observe. (4) a possible mechanism that one can imagine is through a transient impact on marriages. if the marriage rate is reduced during the time of the epidemic, one would indeed expect a reduction in birth numbers some 9 months later. such an explanation can easily be tested provided one can find monthly marriage data (see below). before starting this investigation we need to answer an obvious objection which can be stated as follows. is it not natural that following a reduction in population one sees a fall in the number of births? after this fall the birth numbers will go up again as the population resumes its ascending movement. qualitatively the argument seems satisfactory. if correct, the effect would become trivial. however, in what follows we show that quantitatively the argument is not correct for it explains less than one tenth of the observed birth changes. before coming to more elaborate arguments one can make a simple remark. 
what we see after 9 months is a fairly narrow dip. by contrast, a fall in population would produce a permanent reduction, in other words not a dip but a heaviside step. it is true that, because of the upward trend due to the overall population growth, after a while the births would resume their ascending progression; however, the resulting shape would be a broad trough rather than a narrow dip. this difference can be seen clearly in fig. 1a versus 1b.

figure caption: representation of the bertillon coupling effect in the form of an input/output system. if one interprets the death spike as an impulse function, the birth trough can be seen as the impulse response of the system. an important point is that, as explained in the text, the loss of lives during the death spike cannot by itself account for the observed birth trough. its direct effect is at least 10 times too small, which means that there must also be an indirect effect; it is the purpose of the paper to identify it more closely.

we present the reasoning in two forms. while straightforward, the first is perhaps not very transparent; the second is more theoretical but intuitively clearer. the first argument is very simple. the statistical records provide monthly birth and death numbers, $b_e(t)$ and $d_e(t)$. thus, starting from a known population at initial time $t_0$ we can forecast the population $p_f(t)$ in all subsequent months simply by summing up the monthly population increases $b_e(t) - d_e(t)$. it can be checked (fig. 1a) that $p_f(t)$ is indeed consistent with the observed population evolution $p_e(t)$ (which amounts to saying that in this time interval emigration and immigration do not play a great role). then, under the assumption of a constant birth rate $\lambda$ the predicted monthly birth numbers will be: $b_f(t) = \lambda p_f(t)$. when these numbers are compared with the observed monthly birth numbers $b_e(t)$ one sees a huge difference in the magnitude of the fluctuations (fig. 1b).
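the first argument can be sketched in a few lines of python. this is only an illustration under stated assumptions: the function names are hypothetical, and the monthly series below are made-up numbers chosen to mimic a death spike followed by a birth dip, not data from the paper's sources.

```python
from itertools import accumulate


def forecast_population(p0, births, deaths):
    """Forecast p_f(t): initial population plus the running sum of b_e(t) - d_e(t)."""
    increments = [b - d for b, d in zip(births, deaths)]
    return [p0 + s for s in accumulate(increments)]


def predicted_births(population, monthly_rate):
    """Births expected under a constant monthly birth rate: b_f(t) = rate * p_f(t)."""
    return [monthly_rate * p for p in population]


def relative_range(series):
    """(max - min) / mean, a crude measure of the size of the fluctuations."""
    mean = sum(series) / len(series)
    return (max(series) - min(series)) / mean


# Illustrative (made-up) monthly series, NOT taken from the paper's data.
p0 = 5_779_000
births_e = [9600, 9900, 9300, 7000, 9800, 10100]
deaths_e = [6500, 6400, 12000, 9000, 6300, 6200]
monthly_rate = 0.020 / 12  # assumed annual rate of 20 per thousand

p_f = forecast_population(p0, births_e, deaths_e)
b_f = predicted_births(p_f, monthly_rate)

print("relative range of predicted births:", relative_range(b_f))
print("relative range of observed births: ", relative_range(births_e))
```

with any realistic input the predicted series is almost flat, because a death spike removes only a tiny fraction of the population, whereas the observed series fluctuates by a far larger relative amount.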
to double check that there is no flaw in the data on which the graph is based consider the start and end points. according to flora et al. [12, p.73 ]: • in 1917 when the mid-year population was 5779 thousands a (constant) annual birth rate of 20 per thousand gives 115,580 births for the year, i.e. an average of 9631 births per month. • in 1920 when the mid-year population has increased to 5875 the same birth rate gives 117,500 births, i.e. an average of 9791 per month. this represents a total predicted birth number increase of 1.67% whereas the observed monthly birth numbers show fluctuations of the order of 30%. the following statement (explained in appendix a) summarizes the situation. proposition under the assumption of a constant monthly birth rate λ the fall in births resulting from a monthly death excess e is given by: ∆b = λe; for european countries in the early 20th century λ ≃ 2%/12 which leads to: ∆b ≃ e/600. the proposition has a fairly clear intuitive interpretation. to make things simple let us assume that the death-excess variable e takes place in one month, as is for instance the case for earthquakes. in a virtual world where everybody conceives once every month, for each missing person there would be a missing baby 9 months later. under this assumption the birth trough would be of same magnitude as the death spike. however we know that actually the probability to conceive in a given month is λ/12 = 0.0017 where λ is the annual birth rate. in other words in 10,000 persons only 17 may conceive in the relevant month. thus, this direct effect contributes very little to the birth trough. the birth rate reduction can be attributed to one of the factors mentioned in fig. 2a . the objective of the present paper is to see more clearly which one of these factors plays a leading role. what real-life mechanisms can one think of that may explain the temporary birth rate fall? in fig. 2a there is a distinction between social and biological factors. 
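the arithmetic behind the start and end points and behind the proposition can be checked with a short python sketch. the function name and the death-excess value of 60,000 are hypothetical choices made for illustration; only the populations and the rate of 20 per thousand come from the text.

```python
ANNUAL_BIRTH_RATE = 0.020                    # 20 per thousand, as quoted in the text
MONTHLY_BIRTH_RATE = ANNUAL_BIRTH_RATE / 12  # about 1/600 per month


def direct_birth_deficit(excess_deaths, monthly_rate=MONTHLY_BIRTH_RATE):
    """Direct fall in monthly births for a death excess e at constant rate: rate * e."""
    return monthly_rate * excess_deaths


# Start and end points quoted from flora et al. (mid-year populations).
births_1917 = 5_779_000 * ANNUAL_BIRTH_RATE  # about 115,580 births per year
births_1920 = 5_875_000 * ANNUAL_BIRTH_RATE  # about 117,500 births per year
relative_increase = births_1920 / births_1917 - 1

print("monthly births 1917:", births_1917 / 12)
print("monthly births 1920:", births_1920 / 12)
print("relative increase:  ", relative_increase)

# Hypothetical death excess concentrated in a single month (illustration only).
e = 60_000
print("direct birth deficit:", direct_birth_deficit(e))
```

for a death excess of 60,000 the direct deficit is only about 100 births, i.e. e/600, which illustrates why the direct effect cannot by itself account for a trough that is a sizeable fraction of monthly births.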
it is by comparing different case studies that we came to the conclusion that social factors play a key role. how? as famine and influenza have biological effects, at first sight the resulting birth reduction might also be attributed to such effects. on the contrary, birth reductions after earthquakes can hardly be accounted for by biological factors. for that reason the case of the tokyo earthquake of 1923 played a key role in our understanding. it convinced us that the suffering of the many people who are affected but do not die was of central importance. in examining successive case studies it will also be seen that the magnitude of the birth reduction is compatible with the existence of a group of people that we call collateral sufferers. this term refers to persons who are affected by the event under consideration (whether epidemic, disease or anything else) but do not die. in order to generate the observed birth dips the size of this group must be several times larger than the number of fatalities. this is illustrated in fig. 2b. in sweden, although there were only 43,000 excess deaths in 1918, it is estimated that about one third of the population, that is to say almost 2 million (50 times more than those who died), was affected by the disease. naturally, this fraction is difficult to define precisely because there is a continuum between the persons who were not affected at all and those who experienced very mild forms of the disease. the two following points need to be emphasized.
• among the persons affected by the disease (or even in the whole population), there may have been a more restrained sexual behavior, for instance by fear of a possible contagion. this may have produced a reduction in the number of conceptions.
• among the persons mildly affected by the disease there may have been biological effects leading to less fecundation or early miscarriages in the first 2 or 3 months of pregnancy.
the fact that in what follows we will observe the bertillon birth effect not only for diseases but also for earthquakes may seem to speak against such biological effects. however, one cannot exclude possible psychosomatic effects resulting from a stressful situation. an example is the so-called famine amenorrhea [13-16]. finally it should be mentioned that the birth phase represented in fig. 2a comprises in fact two cases: natural birth and medically assisted birth through caesarean or induced delivery. such medical interventions are much more frequent on ordinary working days than on holidays. for that reason, there is a drastic reduction in daily birth numbers on saturdays, sundays and holidays. for instance, in the united states 1 january, 4 july, labor day, thanksgiving and 24-25 december are marked by a reduction of about 30%. this effect is of great importance when one uses daily data; at the level of monthly data the effect is ''diluted'' because the deliveries are simply postponed by a few days, thus the total monthly figure should not be affected. at first sight one might think that birth dates may be critically affected by marriage dates. however, in the discussion of appendix b it is shown that this effect is fairly weak. one reason is that only first born children are concerned. thus, in the 19th or early 20th century when having four or five children was common, only a small fraction of the births were affected. it is commonly said that, as indicated in fig. 2a, pregnancy lasts 9 months, which corresponds to 9 × 365/12 ≈ 274 days on average. however this statement raises two questions. (1) for this 9-month estimate what starting point is considered? (2) what is the dispersion of pregnancy around its average? this question is of importance because it affects the width of the birth trough.
these questions are discussed in appendix c; it appears that the average time interval between sexual intercourse and birth can be taken equal to: the fact that the standard deviation is of the order of a few days shows that the dispersion of the births (that is to say the width of the dip) will be mostly due to the dispersion of the conceptions. even for a sharply defined event such as an earthquake, the behavior of collateral sufferers may be modified for several weeks. actually, it is through the width of the dip that we can know how the survivors reacted to the shock in terms of intercourse frequency.

in finland the spring of 1867 was unusually cold; in may 1867 the average temperature was only 1.8 °c, which is some 8 degrees below the long-term average. this made it very difficult to grow spring cereals or potatoes. then, in autumn the winter started early and was also colder than average. at the end of 1867 and early 1868 relief grains were imported thanks to a loan of the rothschild bank in frankfurt, however, as often in such cases, the inadequacy of the transportation network hampered delivery. 4 in 1867 the death rate was 38 per thousand, already 35% above average; then in 1868 it climbed to 78 per thousand. just as an element of comparison, it can be mentioned that this rate was 2.8 times higher than the rate of 28 per thousand reached during the famine of 1961 in china. in the breakdown of the deaths according to their causes only 2350 famine deaths were recorded whereas 27,215 deaths were attributed to tuberculosis, dysentery, smallpox and whooping cough. an additional 59,717 were attributed to ''various fevers'' [17, p. 416-417]. these numbers illustrate the zones of fig. 2b; food scarcity was the triggering factor but it killed very few people directly. with respect to a population of 1.7 million, the excess-deaths of 80,000 correspond to a rate of 47 per thousand. of the mortality cases that we are going to examine, this one is the most massive.
thus, one expects the bertillon birth effect to be clearly visible and indeed it is. for the respective peaks the birth-death ratio is approximately 40%/500% = 0.08. in other words, if one considers the mortality increase as the signal and the change in births as the response of the system we have here an output signal which is about 1/10 of the input signal. incidentally, it can be noted that these data were available in bertillon's time but he did not use them.

the pandemic of december 1889-january 1890 is described in [10]. in paris the first cases occurred in the week of 17 november 1889. on the basis of death certificates in paris influenza was given as the direct cause of death for only 250 persons. however, during the time of the epidemic there were about 5000 more deaths than in the corresponding period of the previous years. thus, zone 2 of fig. 2b was 20 times larger than zone 1. with the population of paris numbering about 3 million, one gets a fatality rate of 1.6 per thousand. whether the epidemic can be called a pandemic is a matter of definition. it is true that there was a death spike in many places (saint petersburg, berlin, vienna, paris, london, new york) but with the exception of paris and london, in all other places it was not more severe than the common annual winter death spike. of the 38 pages of bertillon's report, only two are devoted to the effect on birth numbers 9 months later. from the weekly data that he gives for berlin and paris (as well as some other european capitals) one can draw the following observations.

fig. 3a shows the two curves in normal graphical representation; the death scale is on the left and the birth scale on the right (both are expressed in thousands). in fig. 3b the birth curve was shifted 8 months to the left and turned upside down (by turning to negative numbers). as a result this new series represents the conceptions (at this point we do not know why the time lag is 8 months rather than 9 months).
the bertillon birth effect is clearly visible in the fact that the maximum of the deaths coincides with a minimum of the conceptions. in fig. 3b the correlation of the two series is 0.76. a correction was performed on the monthly data so as to give all months the same length of 365/12 = 30.42 days. all monthly data used in the rest of the paper were corrected in this way before being used. source: finland [17].

fig. 4a: the death scale is on the left and the birth scale on the right (both are expressed in thousands). in fig. 4b the birth curve was shifted 9 months to the left and turned upside down. the bertillon birth effect is again clearly visible, this time with the expected time lag of 9 months. in fig. 4b the correlation of the two series is 0.63. source: statistique de la france, nouvelle série. statistique annuelle, various years.

• in paris the peak of the epidemic occurred in the first week of january 1890 whereas the lowest point in the trough of birth numbers occurred in the 41st week of 1890. in both cases the time lag is close to the 39 weeks of a normal pregnancy.

although bertillon examines the timing with great accuracy he does not discuss the question of the amplitude of the troughs. in particular, he does not show that the troughs cannot be explained solely by the deaths due to the epidemic. that is of course a key point, which is why in the previous section we discussed it with great care. although the death toll was much smaller than in the case of finland, the bertillon birth effect appears fairly clearly. what makes the observation more convincing is of course the fact that the trough can be identified not just in a single city (where it might occur almost by chance) but simultaneously in several cities having non-identical seasonal birth fluctuations. in addition to paris and berlin (see

twenty eight years after the pandemic of 1889-1890 there was the great influenza pandemic of 1918.
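the two data manipulations mentioned in the figure captions (rescaling every month to 365/12 days, then shifting the birth series to the left and turning it upside down before correlating it with deaths) can be sketched as follows. this is a minimal illustration, not the authors' code: the function names are ours and the series in the usage note are synthetic, chosen only to show the mechanics.

```python
import calendar
from statistics import mean

STANDARD_MONTH_DAYS = 365 / 12  # 30.42 days, as in the text


def normalize_month(count, year, month):
    """Rescale a monthly count to a month of 365/12 days."""
    days_in_month = calendar.monthrange(year, month)[1]
    return count * STANDARD_MONTH_DAYS / days_in_month


def conception_series(births, lag_months):
    """Shift births `lag_months` to the left and flip the sign."""
    return [-b for b in births[lag_months:]]


def pearson(xs, ys):
    """Pearson correlation over the overlapping part of two series."""
    n = min(len(xs), len(ys))
    xs, ys = xs[:n], ys[:n]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Synthetic example (NOT real data): a death spike in month 5
# followed by a birth trough 9 months later, in month 14.
deaths = [10] * 24
deaths[5] = 50
births = [100] * 24
births[14] = 60

shifted = conception_series(births, lag_months=9)
correlation = pearson(deaths, shifted)
```

with real monthly data one would first apply `normalize_month` to every count and, in all likelihood, remove the seasonal pattern before computing the correlation; the text does not say how the seasonal component was handled, so that step is left out here.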
the death-birth coupling can be observed in all countries where this disease had a substantial impact. 5 however, we will not examine these cases here because they will be studied closely in the second one of this series of two papers.

the particular interest of this case comes from the fact that it was neither due to famine nor to a disease. although called the ''great kanto earthquake'' in japan, it concerned in fact mainly two of the 7 prefectures which constitute the kanto region, namely tokyo and kanagawa (i.e. yokohama just south of tokyo). some 84% of the fatalities were concentrated in these two prefectures. the number of fatalities can be estimated by subtracting the average death numbers of 1922 and 1924 from the death number of 1923, 6 i.e. (in thousands): 1332 − 1270 = 62 [11, p. 441]. for the whole of japan the death rate was 1.1 per thousand, in tokyo it was 8.0 per thousand and in yokohama it was 13 per thousand. the earthquake was accompanied by a tsunami with a wave up to 13 meters high and, especially in tokyo, by fire tornadoes. the birth trough is of smaller amplitude than in previous cases and in fact there are two circumstances which play a crucial role in its identification. (1) although the seasonal births have wide fluctuations (in fact much larger than in other countries) their annual repetitions are very regular. (2) as one knows exactly where the trough is expected even a small signal can be identified. a purely statistical analysis that would fail to take into account these circumstances would result in overestimating the size of the confidence interval. incidentally, if monthly data were available at prefecture level one could get a better accuracy.

two airliners belonging respectively to united airlines and american airlines were crashed into the north and south towers of the world trade center complex in new york city. 7 within about two hours both buildings collapsed.
there were some 3000 fatalities including some 400 firefighters and police officers. with respect to the 8 million population of new york city this corresponds to a rate of 0.37 per thousand.

7 there have been many odd stories and speculations about this attack. however, there is one well documented, indisputable and nevertheless not often mentioned aspect which is the huge amount of put options bought in the days preceding the attack. for stock owners, put options provide a protection against the fall of a stock price because they give them the right to sell their stock at a predetermined price which may be much higher than the current price. moreover, because the price of a put option increases when the price of the stock declines, speculators can make a profit by selling their put options. more details can be found on the following webpage: http://911research.wtc7.netsept11/stockputs.html.

fig. 7 is remarkable because it shows that 9/11 did not produce any birth trough whatsoever. it shows that the death-birth coupling is by no means commonplace. here the zones 1 and 2 of fig. 2b can be merged into one which corresponds to the total death toll. zone 3 would correspond to persons (some 6000) injured but not killed. zone 4 can be seen as comprising the families and close relatives of the persons killed or injured. based on the average size of us households which is of the order of 2.6, one gets for zone 4 a total number of: 2.6 × 9000 = 23,400. of this number only the fraction in the age interval 20-35 would contribute to the birth trough. based on the us population pyramid this fraction is of the order of 15%. for the pandemics considered so far, it is almost impossible to estimate the size of zone 4. however, for an earthquake one can posit that zone 4 corresponds to the persons whose houses have been destroyed or damaged.

the outbreak of sars (severe acute respiratory syndrome) in hong kong is interesting for two reasons.
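the size of zone 4 for 9/11 and of its reproductive-age subgroup can be re-derived from the figures quoted above. the sketch below is only a restatement of that arithmetic; the variable names are ours, and the final product (zone 4 times 15%) is our own extension of the reasoning, not a number given in the text.

```python
# Figures quoted in the text for the 9/11 attack.
persons_killed = 3000
persons_injured = 6000
average_household_size = 2.6    # order of magnitude for US households
fraction_aged_20_to_35 = 0.15   # from the US population pyramid

# Zone 4: families and close relatives of those killed or injured.
zone4_size = average_household_size * (persons_killed + persons_injured)

# Subgroup of zone 4 that could contribute to a birth trough
# (our extension of the text's reasoning).
zone4_reproductive = zone4_size * fraction_aged_20_to_35

print(f"zone 4: about {zone4_size:,.0f} persons")
print(f"aged 20-35: about {zone4_reproductive:,.0f} persons")
```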
it is not only the number of deaths which was small but also the number of cases, namely 1730 (i.e. 260 per million population). worldwide it was the same picture; there were only some 700 deaths compared with the 500,000 who died from influenza in the same year. nonetheless, the city took drastic measures.
• primary and secondary schools were shut for a month beginning in late march.
• various public places were closed.
• financial companies asked their employees not to come to their office and to work from home.
• a whole residential complex called amoy gardens was put under an emergency quarantine. the residents were sent to a vacation center and the building was closed. altogether in this complex 329 people were infected and 42 died.
• unlike the influenza pandemic of 1918, sars was particularly severe for elderly people. none of the infected females under 30 died whereas among males older than 70 the death rate was 75%. thus, although healthcare workers accounted for 23% of all infected persons [20, p. 14], only a few died.
despite the small death toll, the death effect of the sars epidemic can be identified fairly clearly because it occurs two months later than the standard winter death outbreaks (fig. 8). the conception effect can also be identified clearly: the trough of shifted births occurs in march rather than in august.

the epicenter of the earthquake was under sea near the city of sendai which is 300 km north-east of tokyo. the death toll (including the missing) was about 18,400 and a further 6000 were injured. moreover some 400,000 buildings collapsed or half-collapsed. 8 in other words we can take h = 400,000 as an estimate of the number of people who were directly affected. we will see below that under appropriate assumptions one can use this estimate to derive the expected birth number reduction.
the conception trough of march 2011 due to the earthquake appears fairly clearly because it is distinct from the seasonal troughs seen in other years (see fig. 9).

in all epidemics and disasters, apart from the fatalities, there is a group of persons which is directly affected. for an influenza epidemic this would include the persons who fell seriously ill but did not die. for an attack like 9/11 it would be the family members of those who died or were injured. naturally, the previous dichotomous picture is a simplification. actually, there is a gradual transition from those highly affected to those lightly affected. under such an assumption the reduction ∆b in the number of conceptions would be written: ∆b = ∫ (λ − x) h(x) dx, with the integral taken over 0 ≤ x ≤ λ. here λ is the ''normal'' conception rate, x is the zone-dependent conception rate and h(x)dx is the number of persons having a conception rate comprised between x and x + dx. the dichotomous argument which gives ∆b = λh would correspond to: h(x) = 2hδ(x) where δ(x) denotes the dirac delta distribution and h the total number of people affected. the function h(x) describes the severity of the shock among the surviving population. if h(x) is concentrated in a narrow interval (λ − ϵ, λ) it means that the surviving population is not much affected. on the contrary, a function h(x) concentrated in a narrow interval (0, ϵ) means a severe incidence among those affected (their reproduction rate falls from λ to almost zero). in the next section we discuss briefly how information about the function h(x) can be derived from observation.

table 1 summarizes the death rates of the various case studies considered above.

the function h(x) describes the incidence of the death spike in terms of suppressed conceptions and suppressed births. how can one derive information about it from observation? one can offer the following suggestions.
• first, are suppressed conceptions identical to suppressed births? not necessarily.
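as a worked check of the dichotomous limit, take the reduction to be the integral of (λ − x) h(x) over 0 ≤ x ≤ λ, which is the form consistent with the definitions of λ, x and h(x) given above (an assumption on our part). the total number of affected persons is written H below to distinguish it from the function h(x), and a delta function sitting at the lower limit of integration is taken to contribute one half of its weight, which is the convention under which the factor 2 in h(x) = 2hδ(x) makes sense.

```latex
\begin{align}
\Delta b &= \int_0^{\lambda} (\lambda - x)\, h(x)\, dx
  && \text{(assumed general form)} \\
h(x) = 2H\,\delta(x) \;\Longrightarrow\;
\Delta b &= 2H \int_0^{\lambda} (\lambda - x)\,\delta(x)\, dx
  = 2H \cdot \tfrac{1}{2}\,\lambda = \lambda H
  && \text{(dichotomous case)} \\
h(x) \text{ concentrated in } (\lambda-\epsilon,\lambda) \;\Longrightarrow\;
0 \le \Delta b &\le \epsilon \int_{\lambda-\epsilon}^{\lambda} h(x)\, dx
  && \text{(lightly affected population)}
\end{align}
```

the last line makes the qualitative statement in the text explicit: when the affected persons keep a conception rate within ϵ of the normal rate, the reduction is at most ϵ times their number.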
there can be a biological regulation which leads to early elimination of embryos which are not in good shape because they were generated in conditions of illness, scarcity or famine. this is a medical question for which one should be able to find reliable information in the medical literature.
• the fraction of the population directly affected by the mortality spike comprises the following subgroups: (i) persons killed (k); (ii) persons injured or who were seriously ill (h_2); (iii) persons whose living conditions were affected in a major way, for instance because they lost a family member (h_3a) or because their home was destroyed (h_3b).

so far, we have focused on death spikes. however, according to the view outlined above there could also be cases displaying birth troughs without any (or at least very few) fatalities but instead a substantial number of persons who became ill, were injured or were otherwise severely affected. as a possible example we investigated west nile fever in the united states. this disease, which is due to a virus carried by mosquitoes, peaks in august-september, which would give a birth trough in may. however, we could not identify any significant birth trough even in the states most affected, e.g. texas in 2012. the number of incapacitated persons may be too low.

there is no birth trough following 9/11 but there is one in the hong kong sars epidemic whose death rate is about 10 times smaller. this seems fairly puzzling but the two events were of very different kinds. whereas 9/11 was a one-day event, the sars epidemic lasted a few months and, especially during early times, the spread of the disease as well as its severity (in terms of number of deaths per case) were a matter of uncertainty and concern. thus, even persons who did not become ill were affected.

we have shown that sudden death spikes are almost always followed 9 months later by a birth trough. we have also seen that the spike of 9/11 did not lead to the expected birth reduction.
this observation is particularly intriguing when compared with the sars outbreak in hong kong in which there were fewer deaths than in 9/11 and which was nonetheless followed by a birth dip. although a tentative explanation was proposed it is clear that in order to get a better understanding it would be useful to examine other cases in which a death spike is not followed by a birth dip. for that purpose the normal procedure is to propose predictions in the hope that for some of them the expected dip will not materialize. this strategy is very much in line with what was done in the development of physics. every time an expectation happened to be contradicted by observation, this was an opportunity for new progress. a well-known case was the non-observation of the aether by michelson and morley which led to the theory of relativity. there are two possible kinds of predictions.
• the first kind consists in what we may call standard predictions; they are instances very similar to those already analyzed but less well known because of smaller amplitude. the predictions of birth troughs in the netherlands in 1920-1922 and in chile in 1923 are of this kind. both graphs are shown in appendix e which is included in the arxiv version of this paper. although in the case of chile we do not yet know the reason for the death spike of july-august 1923, the birth trough could indeed be observed in the month in which it was expected.
• in the second kind of predictions one deliberately considers a cause of death that has not yet been tested. one case of that kind is deaths due to heat waves. for instance in france in august 2003 a heat wave caused 13,700 excess deaths. this was a fairly exceptional case. most other heat waves caused only of the order of one or a few thousand excess deaths. in contrast with diseases, for heat waves there is no contagion effect; in contrast with earthquakes there are no collateral destructions.
however, in contrast with 9/11, all persons (at least those who do not have air conditioning) are directly affected to some degree. are there birth dips in the wake of heat waves? because of the fairly low amplitude of most death spikes the identification of the troughs turns out to be more difficult but they are nevertheless present [21, 22].

in the case of a death spike the population change will be given by the excess-death number: ∆p = −[d(t) − b(t)] = −e (for the sake of simplicity we ignore the 9-month time-lag between conception and birth). this leads to the proposition given in the text. as a case in point in order to illustrate the previous argument with real data we consider again sweden during the influenza epidemic of 1918. it is on purpose that we selected a country which did not take part in the first world war so as to avoid any interference. the data are taken from [11, p. 313, 438] and [12, p. 73]. in early 1918 the swedish population numbered 5.8 million; on average its annual birth and death rates were 2.0% and 1.3% respectively. in other words, the birth number reduction cannot simply be a ''mechanical'' consequence of the death spike. it can only be explained by a drastic reduction in the birth rate.

average time interval between intercourse conducive to conception and birth. the familiar estimate of 9 (mean) months corresponds to an interval of 274 days, that is to say one week longer than the (more accurate) estimate given in the figure. the standard 280 day figure refers to the time interval between the last menstrual period and birth; it overestimates the time between sexual intercourse and birth by two weeks. the 9 day dispersion of ovulation refers to its standard deviation. source: wilcox et al. [23, 24], bhat and kushtagi [25].

each duration, which gives an average of d = 267. the distribution of the fertility window is approximately gaussian 10 with a standard deviation of 9 days. in summary: d = 267 ± σ, σ = 9 days.
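the arithmetic behind d = 267 days can be retraced in a few lines. the sketch below follows the reasoning given in the appendix (280 days from the last menstrual period, ovulation on day 14, intercourse 0, 1 or 2 days before ovulation with equal probability); the variable names are ours, and the equal weighting of the three offsets is taken from the text rather than from independent data.

```python
from statistics import mean

LAST_PERIOD_TO_BIRTH_DAYS = 280   # standard 40-week figure
AVERAGE_OVULATION_DAY = 14        # global average over all cycles

# Time from ovulation to birth.
ovulation_to_birth = LAST_PERIOD_TO_BIRTH_DAYS - AVERAGE_OVULATION_DAY

# Intercourse 0, 1 or 2 days before ovulation, equally likely.
intercourse_to_birth = [ovulation_to_birth + offset for offset in (0, 1, 2)]
mean_interval = mean(intercourse_to_birth)

# The familiar "9 months", using months of 365/12 days.
nine_months_days = round(9 * 365 / 12)

print(ovulation_to_birth, intercourse_to_birth, mean_interval)
print(nine_months_days - mean_interval)  # difference in days
```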
this estimate is confirmed by the following result based on a sample of 125 pregnancies in which ''the median time from ovulation to birth was 268 days'' [26] . remark. in the previous results ovulation time was determined by urinary hormone measurements or estimated through ultrasound observation performed later on in the pregnancy. needless to say each of these methods involve some uncertainty. it might seem that medically assisted conception would afford a direct and henceforth more accurate method. the difficulty here is that such pregnancies are know to lead to an inflated proportion of preterm deliveries. cycle day of insemination, coital rate and sex-ratio the inconstancy of human sex ratios at birth. eletters to the editor sex ratio at birth and war in croatia the variations of human sex ratio at birth during and after wars, and their potential explanations offspring sex ratios at birth as markers of paternal endocrine disruption sex differences in measles mortality: a world review sexist diseases can quantum-mechanical description of physical reality be considered complete? influence des épidémies de grippe sur la fécondité the influenza epidemic in paris and in some other cities in western europe le mouvement naturel de la population dans le monde de 1906 à 1936 state, economy, and society in western europe 1815-1975. a data handbook in two volumes famine amenorrhea (seventeenth to twentieth century) biologie des menschen in der geschichte. beiträge zur socialgeschichte der neuzeit aus frankreich or skandinavia les crises de subsistence et la démographie dans la france d'ancien régime the french subtitle of this official publication is: éléments démographiques principaux de la finlande pour les années 1750-1890, ii: mouvement de la population. the finnish title is: suomen [finland] väestötilastosta the swedish title of this official publication is: befolknings-statisk [population statistics] ii. 1. 
underdåniga berättelse [report to the king] för åren 1856 med 1860 statistique internationale du mouvement de la population d'après les registres d'état civil. résumé rétrospectif depuis l'origine des statistiques de l'état civil jusqu'en 1905 epidemiology of sars in the 2003 hong kong epidemic évolution de la saisonnalité des naissances en france de 1975 à nos jours [changes in the seasonal birth pattern in france from vagues de chaleur, fluctuations ordinaires des températures et mortalité en france depuis 1971 [heat waves, temperature fluctuations and mortality in france from timing of sexual intercourse in relation to ovulation. effects on the probability of conception, survival of the pregnancy, and sex of the baby the timing of the ''fertile window'' in the menstrual cycle: day specific estimates from a prospective study a re-look at the duration of human pregnancy length of human pregnancy and contributors to its natural variation

we wish to express our sincere thanks to ms. ela klayman-cohen of the swedish ''statistical central agency'' (statistiska centralbyrån), ms. maija maronen of ''statistics finland'', and mr. chihiro omori of the japanese ministry of internal affairs for their kind help in guiding us through the rich datasets of their respective countries.

in this appendix we examine the implication of a constant birth rate on population and birth changes. the monthly birth rate is defined as: λ = b(t)/p(t) where b(t) is the number of births in month t. now let us apply changes ∆b and ∆p to the numerator and denominator respectively. how these changes must be connected if λ is to remain constant is shown by the following calculation.

an explanation based on the number of marriages may be considered but in fact it can be quickly discarded. one may say that during the time of the epidemic people postponed planned marriages.
as in sweden in any normal month there were about 4000 marriages, a drastic reduction could in principle account nine months later for a fall in births of the same magnitude, that is to say of the size actually observed. however, it appears that in most countries there was only a slight reduction in the number of marriages or even none at all. thus, in sweden in october 1918, at the height of the influenza epidemic, there were 4323 marriages compared to an average of 4255 in the same month of october in 1915, 1916 and 1917. even when there is a fall in marriages it does not necessarily have an impact on the number of births 9 months later. an illustration is provided by new york city in september 2001. as a result of the world trade center attack the number of marriages fell from 6753 in august 2001 to 2616 in september. however the correlation between the time series of monthly percentage variations of marriages and conceptions over the 60 months from 1999 to 2003 is as low as 0.025 (which, for a confidence level of 0.95, is not a significant correlation). the corresponding linear regression reads: ∆b/b = 0.003(∆m/m) + 0.11 which shows that the observed fall of 61% of marriages will translate into a fall of less than 1% for the births. in short, in some cases where there is both a substantial reduction in the number of marriages (which is rare) and a significant marriage-birth correlation, this factor may contribute but it cannot be the root factor that we are looking for.

the standard length of pregnancy, namely 40 weeks (280 days), is counted from the woman's last period, not the date of conception which generally occurs two weeks later. for the present study we rather need the time d between the sexual intercourse which led to the conception and birth. the conception results from the encounter between a spermatozoon and an ovum (also called ovule or egg cell).
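the claim made earlier in this appendix, that the 61% fall in new york marriages translates into a change of less than 1% in births, can be checked directly from the regression slope. the sketch below assumes that both variables of the regression are expressed in percent; the text does not state the units, so the intercept of 0.11 is left out and only the slope term is evaluated. variable names are ours.

```python
# Marriage counts quoted in the text for New York City, 2001.
marriages_august = 6753
marriages_september = 2616

# Percentage change in marriages from August to September.
marriage_change_pct = (
    100 * (marriages_september - marriages_august) / marriages_august
)

# Slope of the regression of birth changes on marriage changes.
regression_slope = 0.003

# Contribution of the marriage fall to the change in births
# (assumption: both sides of the regression are in percent).
birth_change_from_slope_pct = regression_slope * marriage_change_pct

print(f"marriage change: {marriage_change_pct:.1f}%")
print(f"implied birth change: {birth_change_from_slope_pct:.2f}%")
```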
ovulation means that the egg is released from the ovary into the fallopian tube; it remains there in good shape for only one day which means that conception and ovulation must take place almost on the same day. in contrast the spermatozoa can stay alive in the fallopian tube for 3 days with similar conception probabilities during those 3 days. 9 once fertilized, the egg starts its journey down the fallopian tube and into the uterus where it will get implanted (see fig. c.1). the key question then is: when does ovulation occur? it is often said that it occurs at day 14 in the menstrual cycle. in fact, this depends upon the length of the menstrual cycle [24]. when the length of the cycle is 28 days, ovulation occurs on average at day 12. when the cycle lasts more than 30 days, the ovulation takes place on day 14. the global average for all cycles is 14 days. coupled with the standard estimate of 280 days starting from the beginning of the cycle one gets: 280 − 14 = 266 days following ovulation. with respect to intercourse one gets time intervals of d = 266, 267, 268 days with same probability for

9 actually they can remain alive for about 6 days but the probability of conception during these three extra days is only one third of the conception probability during the first 3 days [23].

key: cord-319860-zouscolw authors: wu, jianhua; mamas, mamas a; mohamed, mohamed o; kwok, chun shing; roebuck, chris; humberstone, ben; denwood, tom; luescher, thomas; de belder, mark a; deanfield, john e; gale, chris p title: place and causes of acute cardiovascular mortality during the covid-19 pandemic date: 2020-09-28 journal: heart doi: 10.1136/heartjnl-2020-317912 sha: doc_id: 319860 cord_uid: zouscolw

objective: to describe the place and causes of acute cardiovascular death during the covid-19 pandemic. methods: retrospective cohort of adult (age ≥18 years) acute cardiovascular deaths (n=587 225) in england and wales, from 1 january 2014 to 30 june 2020.
the exposure was the covid-19 pandemic (from onset of the first covid-19 death in england, 2 march 2020). the main outcome was acute cardiovascular events directly contributing to death. results: after 2 march 2020, there were 28 969 acute cardiovascular deaths of which 5.1% were related to covid-19, and an excess acute cardiovascular mortality of 2085 (+8%). deaths in the community accounted for nearly half of all deaths during this period. death at home had the greatest excess acute cardiovascular deaths (2279, +35%), followed by deaths at care homes and hospices (1095, +32%) and in hospital (50, +0%). the most frequent cause of acute cardiovascular death during this period was stroke (10 318, 35.6%), followed by acute coronary syndrome (acs) (7 098, 24.5%), heart failure (6 770, 23.4%), pulmonary embolism (2 689, 9.3%) and cardiac arrest (1 328, 4.6%). the greatest cause of excess cardiovascular death in care homes and hospices was stroke (715, +39%), compared with acs (768, +41%) at home and cardiogenic shock (55, +15%) in hospital. conclusions and relevance: the covid-19 pandemic has resulted in an inflation in acute cardiovascular deaths, nearly half of which occurred in the community and most of which did not relate to covid-19 infection, suggesting there were delays to seeking help or that they were likely the result of undiagnosed covid-19.

cardiovascular disease (cvd) is one of the most prevalent underlying conditions associated with increased mortality from covid-19 infection. [1-5] yet, we and others have shown a substantial reduction in presentations to hospitals with acute cardiovascular (cv) conditions including acute coronary syndrome, heart failure, cardiac arrhythmia and stroke during the pandemic. [6-11] this would be expected to result in a much higher number of deaths, unless there has been an actual decrease in the incidence of these acute conditions.
the detailed impact on mortality from acute cvd has, however, not been studied at national level. we now report, with high temporal resolution, cv-specific mortality during covid-19 in england and wales. in particular, we have evaluated the location of cv deaths (eg, hospitals, home or care homes), their relation to covid-19 infection and the specific cv fatal events that contributed directly to death. this information is vital for the understanding of healthcare policy during the pandemic and to assist governments around the world reorganise healthcare services. the analytical cohort included all certified and registered deaths in england and wales ≥18 years of age, between 1 january 2014 and 30 june 2020 recorded in the civil registration deaths data of the office for national statistics (ons) of england and wales. 12 the primary analysis was based on any of the 10th revision of the international statistical classification of diseases and related health problems (icd-10) codes corresponding to the immediate cause of death and contributed causes registered, as stated on the medical certificate of cause of death (mccd). the mccd is completed by the doctor who attended the deceased during their last illness within 5 days unless there is to be a coroner's postmortem or an inquest. cv events directly leading to death (herein called acute cv deaths) were categorised as acute coronary syndrome (st-elevation myocardial infarction (stemi), non-stemi, type 2 myocardial infarction, reinfarction) abbreviated as acute coronary syndrome, heart failure, cardiac arrest, ventricular tachycardia (vt) and/ or ventricular fibrillation (vf), stroke (acute ischaemic stroke, acute haemorrhagic stroke, other non-cerebral strokes, unspecified stroke), cardiogenic shock, pulmonary embolism, deep venous thrombosis, aortic disease (aortic aneurysm rupture and aortic dissection) and infective endocarditis (online supplemental table 1). 
icd-10 codes 'u071' (confirmed) and 'u072' (suspected) were used to identify whether a death was related to covid-19 infection on any part of the mccd. the place of death as recorded on the mccd was classified as home, care home and hospice, and hospital.

baseline characteristics were described using numbers and percentages for categorical data. data were stratified by covid-19 status (suspected or confirmed, not infected), age band (<50, 50-59, 60-69, 70-79, 80+ years), sex and place of death (home, hospital, care home or hospice). the number of daily deaths was presented using a 7-day simple moving average (the mean number of daily deaths for that day and the preceding 6 days) from 1 february 2020 up to and including 30 june 2020, adjusted for seasonality. the expected daily deaths from 1 february 2020 up to and including 30 june 2020 were estimated using the farrington surveillance algorithm for daily historical data between 2014 and 2020 [13]. the algorithm used overdispersed poisson generalised linear models with spline terms to model trends in counts of daily death, accounting for seasonality. the number of non-covid-19 cv deaths each day from 1 february 2020 was subtracted from the estimated expected daily deaths in the same time period to create a zero historical baseline. deaths above this baseline may be interpreted as excess mortality, which was calculated as the difference between the observed daily deaths and the expected daily deaths. negative values, where the observed deaths fell below the expected deaths, were set to zero. the rate of excess deaths was derived by dividing excess mortality by the sum of the expected deaths between 2 march 2020 and 30 june 2020. for the categories of acute cv death, the icd-10 code on the mccd was counted only once per deceased. thus, the overall rate of acute cv death represents the number of people with a direct cv cause of death.
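the excess-mortality bookkeeping described above (7-day moving average, clipping of negative differences, rate relative to expected deaths) can be sketched as follows. this is a minimal illustration in python, not the authors' r code: the expected counts are simply passed in as a list rather than estimated with the farrington algorithm, and all function names and the numbers in the usage note are ours.

```python
def moving_average_7(daily_counts):
    """7-day simple moving average: the mean of each day and the
    preceding 6 days. Days without 6 predecessors yield None."""
    averages = []
    for i in range(len(daily_counts)):
        if i < 6:
            averages.append(None)
        else:
            window = daily_counts[i - 6 : i + 1]
            averages.append(sum(window) / 7)
    return averages


def daily_excess(observed, expected):
    """Observed minus expected deaths, with negative values set to zero."""
    return [max(o - e, 0.0) for o, e in zip(observed, expected)]


def excess_rate(observed, expected):
    """Total excess deaths divided by the sum of expected deaths."""
    return sum(daily_excess(observed, expected)) / sum(expected)


# Toy numbers for illustration only (NOT data from the study).
observed_deaths = [10, 12, 8, 15]
expected_deaths = [10, 10, 10, 10]
toy_rate = excess_rate(observed_deaths, expected_deaths)
```

because negative daily differences are set to zero, the total excess obtained this way can exceed the plain difference between total observed and total expected deaths.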
given that people may have more than one of the predefined cv events leading to death, analyses for each of the predefined cv categories represent the number of events (not people) per category. for the purposes of this investigation, cvd that contributed to, but did not directly lead to, death was excluded from the analyses. all tests were two sided and statistical significance was considered as p<0.05. statistical analyses were performed in r v.4.0.0, and the farrington surveillance algorithm was fitted using r package 'surveillance'.

between 1 january 2014 and 30 june 2020, there were 3 450 381 deaths from all causes among adults. the proportion of deaths increased with increasing age band and 1 752 908 (50.8%) were in women (table 1). people dying from any of the directly contributing cv categories accounted for 587 225 (17.0%) of all deaths, of which 6.0% had at least two of the predefined cv categories that directly contributed to death. most deaths occurred in hospital (63.0%) followed by home (23.5%) and care homes (13.5%). following the first uk death from covid-19 on 2 march 2020, there were 28 969 acute cv deaths of which 5.1% were related to covid-19 (7.9% suspected; 92.1% confirmed), and an excess acute cv mortality of 2085 (a proportional increase of 8%) compared with the expected historical average in the same time period of the year. covid-19 deaths accounted for 1307 (71.0%) of all excess deaths after this date (figure 1, table 2). compared with deaths prior to 2 march 2020, covid-19-related acute cv deaths were more likely to occur in hospital (81.1% vs 63.0%), much less likely to occur at home (7.1% vs 23.5%) and remained of similar proportions to non-covid-19-related acute cv deaths in care homes (13.5% vs 11.8%). the rate of covid-19-related excess cv deaths was higher in hospitals than care homes (a proportional increase of 7% vs 5%) and lower at home (a proportional increase of 2%).
excess covid-19-related acute cv deaths occurred in similar proportions for men and women (a proportional increase of 6% vs 5%), and the rate of excess covid-19-related acute cv deaths was comparable across the age bands (table 2). the greatest proportional increase of excess covid-19-related acute cv death was due to pulmonary embolism (251, a proportional increase of 11%) followed by stroke (562, a proportional increase of 6%), acute coronary syndrome (318, a proportional increase of 5%), cardiac arrest (93, a proportional increase of 6%) and heart failure (273, a proportional increase of 4%) (figure 2, table 2). the most frequent causes of excess acute cv death in care homes and hospices were stroke (715, a proportional increase of 39%) and heart failure (227, a proportional increase of 25%), which compared with acute coronary syndrome (768, a proportional increase of 41%) and heart failure (734, a proportional increase of 33%) at home, and pulmonary embolism (155, a proportional increase of 13%) and cardiogenic shock (55, a proportional increase of 15%) in hospital (figure 3, table 3). for stroke, acute coronary syndrome, heart failure and cardiac arrest, the numbers of deaths in hospital were lower than the historical baseline (figure 3). we show for the first time, in a nationwide complete analysis of all adult deaths, the extent, site and underlying causes of the increased acute cv mortality during the covid-19 pandemic compared with previous years. this shows that the pandemic has resulted in an abrupt inflation in acute cv deaths above that expected for the time of year. nearly half of the deaths occurred outside of the hospital setting, either at home or in care homes, with people's homes witnessing the greatest proportional increase in excess acute cv deaths. the most frequent cause of acute cv death during the covid-19 pandemic in england and wales was stroke followed by acute coronary syndrome and heart failure.
this is key information to optimise messaging to the public, as well as for allocation of health resources and planning. numerous international studies have reported the decline in hospital presentations for a range of cv emergencies. [6] [7] [8] [9] [10] [11] to the best of our knowledge, this is the first study to show that this is associated with an adverse overall cv impact. while stroke and acute coronary syndrome accounted for the vast majority of acute cv deaths, the number of deaths in hospital due to these conditions fell below that expected for the time of year and it increased in the community, and particularly in people's homes. this 'displacement of death', most likely, signifies that the public either did not seek help or were not referred to hospital during the pandemic-a finding supported by the fact that the majority of acute cv deaths were not recorded as related to infection with covid-19. given the times series plots show that the excess in acute cv mortality began in late march 2020 and peaked in early april 2020, government directives at the time including the onset of the uk lockdown on 23 march 2020 could have accentuated a maladaptive public response. the major causes of acute cv death were different between hospital and community settings. this 'differential of cause of death by place' provides an understanding of how the infection and public response to the pandemic played out. the most frequent cause of excess acute cv death in people's homes was acute coronary syndrome, in care homes and hospices it was stroke and in hospital it was pulmonary embolism. assuming that the public did not seek help for medical emergencies for fear of contagion in hospital or to prevent hospitals being overwhelmed, then it is not surprising that there were deaths from acute coronary syndrome at home. complications of untreated acute myocardial infarction include cardiac arrest, arrhythmia and acute heart failure. 
we found that in people's homes there were increases in excess acute cv deaths from cardiac arrest (in line with others' findings 14 ) and heart failure, and in hospitals there were increases in excess deaths from cardiogenic shock and vt and vf, all of which are complications of late presentation myocardial infarction. in hospital, we also found an inflation of deaths from infective endocarditis and aortic dissection and rupture, indicating perhaps a more advanced (and for some, irreversible) stage of disease presentation during the pandemic, akin to the situation with acute myocardial infarction. care homes and hospices witnessed a substantial increase in excess acute cv deaths. herein, stroke, heart failure, acute coronary syndrome and pulmonary embolism were the most common causes of acute cv death. this finding highlights the susceptibility of the elderly and comorbid to the wider implications of the covid-19 crisis. that is, not only were care home residents prone to the respiratory effects of covid-19 infection but they will also have been exposed to the acute cv complications of covid-19 and decisions not to go to hospital for fear of becoming infected. this situation will have been exacerbated by several factors, including the discharge of unknowingly infected patients from hospitals to care homes early in the course of the pandemic (where the virus could easily spread 15 and actions to reduce the spread of the virus in social care were too late and insufficient 16 ), a lack of systematic antibody testing for the sars-cov-2 virus, the efficient person-to-person transmission of the virus, and its propensity to cause death in the vulnerable. 1 17 while previous reports have described an elevated risk of death among the elderly and people with cv disease during the covid-19 pandemic, none have characterised the cv events directly leading to death and few quantified the excess in acute cv mortality.
1 3 18 to date, insights have been derived from small series of cases, regional or national death records data, each reporting elevated mortality rates, but none by the type and place of cv death together. 1 2 18-22 the unique strengths of this investigation include full population coverage of all adult deaths across places of death. most previous reports have been confined to hospital deaths and have not captured the full extent of the impact of the pandemic, including deaths outside of hospitals in people who may not have been tested for the disease.
figure 2 time series of acute cardiovascular (cv) deaths by covid-19, by cause of death. the number of daily cv deaths is presented using a 7-day simple moving average (indicating the mean number of daily cv deaths for that day and the preceding 6 days) from 1 february 2020 up to and including 30 june 2020, adjusted for seasonality. the number of non-covid-19 excess cv deaths each day from 1 february 2020 were subtracted from the expected daily death estimated using farrington surveillance algorithm in the same time period. the green line is a zero historical baseline. the red line represents daily covid-19 cv death from 2 march to 30 june 2020; the purple line represents excess daily non-covid-19 cv death from 2 march to 30 june 2020 and the blue line represents the total excess daily cv death from 1 february to 30 june 2020. vf, ventricular fibrillation; vt, ventricular tachycardia.
figure 3 time series of acute cardiovascular (cv) deaths by cause of death and place of death. the number of daily cv deaths is presented using a 7-day simple moving average (indicating the mean number of daily cv deaths for that day and the preceding 6 days) from 1 february 2020 up to and including 30 june 2020, adjusted for seasonality. the number of non-covid-19 excess cv deaths each day from 1 february 2020 were subtracted from the expected daily death estimated using farrington surveillance algorithm in the same time period. the green line is a zero historical baseline. the red line represents excess daily death at hospital; the purple line represents excess daily cv death at care home and hospice and the blue line represents excess daily cv death at home. vf, ventricular fibrillation; vt, ventricular tachycardia.
nonetheless, our study has limitations. during the covid-19 pandemic, emergency guidance enabled any doctor in the uk (not just the attending) to complete the mccd, the duration of time over which the deceased was not seen before referral to the coroner was extended from 14 to 28 days, and causes of death could be 'to the best of their knowledge and belief' without diagnostic proof, if appropriate and to avoid delay. 23 this may have resulted in misclassification bias, with under-reporting of the deaths directly due to cv disease in preference to covid-19 infection (which is a notifiable disease under the health protection (notification) regulations 2010) or respiratory disease. in fact, we found that mccds with covid-19 certification less frequently contained details of acute cv events directly leading to death. although the mccd allows the detailing of the sequence of events directly leading to death, we found that after 2 march 2020 few (5.7%) had multiple acute cv events recorded, and therefore the categorisation of the acute cv events effectively represents per-patient events. the lower proportion of deaths with covid-19 at home and in care homes may represent the lack of access to community-based covid-19 testing. equally, because there was no systematic testing of the uk populace for the presence of covid-19, deaths associated with the infection may have been underestimated. 24 this analysis will have excluded a small proportion of deaths under review by the coroner, though typically these will have been unnatural in aetiology.
in addition, we did not include the spatial information in the farrington surveillance algorithm, which may affect the accuracy of the estimates for the expected death. to date, there is no whole-population, high temporal resolution information about acute cv-specific mortality during the covid-19 pandemic. through the systematic classification of all adult deaths in england and wales, it has been possible to show that there has been an excess in acute cv mortality during the covid-19 pandemic, seen greatest in the community and which corresponds with the onset of public messaging and the substantial decline in admissions to hospital with acute cv emergencies. cardiogenic shock 0 (0%) 0 (0%) 0 (0%) 0 (0%) 14 (+4%) 55 (+15%) vt and vf 0 (0%) 1 (+3%) 0 (0%) 0 (0%) 4 (+2%) 17 (+9%) deep vein thrombosis 0 (0%) 2 (+7%) 0 (0%) 0 (0%) 2 (+3%) 5 (+7%) *the positive excess rate in hospital was due to setting those daily deaths below the expected historical average to zeros. cv, cardiovascular; vf, ventricular fibrillation; vt, ventricular tachycardia. what is already known on this subject? ► cardiovascular disease is one of the most prevalent underlying conditions associated with increased mortality from covid-19 infection, along with dementia and alzheimer's disease. ► at the same time, there has been a substantial reduction in presentations to hospitals with acute cardiovascular (cv) conditions. ► our study of all adult deaths in england and wales between 1 january 2014 and 30 june 2020 has quantified the cv mortality impact of the covid-19 pandemic, be this related to contagion and/or the public response. ► it shows that during the pandemic there has been an inflation in acute cv deaths above that expected for the time of year. ► home death had the greatest increase in excess acute cv death, and the most frequent cause of acute cv death during this period was stroke, followed by acute coronary syndrome. how might this impact on clinical practice? 
► these contemporary nationwide cause and place of mortality data provide key information to optimise messaging to the public, as well as for allocation of health resources and planning. digital with the mortality data and takes responsibility for the integrity of these data. the programme was endorsed by the british heart foundation collaborative, which also includes health data research uk, hsc public health agency, national institute for cardiovascular outcomes research, cancer research uk, public health scotland, nhs digital, sail databank and uk health data research alliance. contributors cpg and jw were responsible for the study design and concept. jw performed the data cleaning and analysis. jw and cpg wrote the first draft of the manuscript, and all authors contributed to the writing of the paper. funding jw and cpg are funded by the university of leeds. mam is funded by the university of keele. competing interests none declared. patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research. patient consent for publication not required. ethics approval ethical approval was not required as this study used fully anonymised routinely collected civil registration deaths data. the data analysis was conducted through remote access to nhs digital data science server. provenance and peer review not commissioned; externally peer reviewed.
estimating excess 1-year mortality associated with the covid-19 pandemic according to underlying conditions and age: a populationbased cohort study case-fatality rate and characteristics of patients dying in relation to covid-19 in italy features of 20 133 uk patients in hospital with covid-19 using the isaric who clinical characterisation protocol: prospective observational cohort study pre-existing conditions of people who died with covid-19 peop lepo pula tion andc ommunity/ birt hsde aths andm arriages/ deaths/ articles/ anal ysis ofde athr egis trat ions noti nvol ving coro navi rusc ovid 19en glan dand wale s28d ecem ber2 019t o1ma y2020/ technicalannex# characteristics the covid-19 pandemic and the incidence of acute myocardial infarction collateral effect of covid-19 on stroke evaluation in the united states emergency hospital admissions and interventional treatments for heart failure and cardiac arrhythmias in germany during the covid-19 outbreak: insights from the german-wide helios hospital network covid-19 pandemic and admission rates for and management of acute coronary syndromes in england patient response, treatments and mortality for acute myocardial infarction during the covid-19 pandemic user guide to mortality statistics an improved algorithm for outbreak detection in multiple surveillance systems out-of-hospital cardiac arrest during the covid-19 outbreak in italy let's be open and honest about covid-19 deaths in care homes staggering number" of extra deaths in community is not explained by covid-19 sars-cov-2: virus dynamics and host response sex-differences in mortality rates and underlying conditions for covid-19 deaths in england and wales excess uk deaths in covid-19 pandemic top 50,000 real estimates of mortality following covid-19 infection covid-19) mortality rate use of all cause mortality to quantify the consequences of covid-19 in nembro, lombardy: descriptive study guidance for doctors completing medical certificates of cause of 
death in england and wales tackling uk's mortality problem: covid-19 and other causes acknowledgements we acknowledge the intellectual input of professor colin baigent, university of oxford. jw had full access to all the data in the study and takes responsibility for the accuracy of the data analysis. the ons provided nhs this article is made freely available for use in accordance with bmj's website terms and conditions for the duration of the covid-19 pandemic or until otherwise determined by bmj. you may use, download and print the article for any lawful, non-commercial purpose (including text and data mining) provided that all copyright notices and trade marks are retained. key: cord-301399-s2i6qfjn authors: rana, jamal s.; khan, sadiya s.; lloyd-jones, donald m.; sidney, stephen title: changes in mortality in top 10 causes of death from 2011 to 2018 date: 2020-07-23 journal: j gen intern med doi: 10.1007/s11606-020-06070-z sha: doc_id: 301399 cord_uid: s2i6qfjn nan trends in mortality rates due to leading causes of death reflect the medical, psychosocial, and economic well-being of a society, and a historical snapshot of such trends can help inform policies of the future. therefore, we examined changes in the number of deaths and age-adjusted mortality rates (aamr) attributed to the top 10 causes of death between 2011 and 2018, the last year we have data available from the centers for disease control and prevention. we chose 2011 as the start date because of earlier work showing a transition in 2011 in 2 of the top 10 causes of death (heart disease and stroke) from a long-term decline to increasing numbers of deaths since then. 1 the centers for disease control and prevention wide-ranging online data for epidemiologic research (cdc wonder) dataset was used to identify national changes in the number of deaths and aamr due to the top 10 underlying causes of death from january 1, 2011, to december 31, 2018. 2 the population projection was obtained from u.s. 
census data. 3 as of 2018, the top 3 causes of death were heart disease, cancer, and accidents (table 1). the largest percentage decline for aamr occurred for cancer deaths (−11.8%), and the greatest increase in aamr occurred for deaths due to alzheimer disease (+23.5%). aamr for influenza and pneumonia (−5.1%) and chronic lower respiratory diseases (−6.6%) declined. increases in aamr due to accidental deaths (+22.8%) and intentional self-harm (suicide) (+15.4%) were observed. even though the aamr declined for 7 of the 10 top causes of death, the number of deaths increased for all 10 of the leading causes. this is because the older (age ≥ 65 years) age group grew at a much more rapid rate than the younger (age < 65 years) age group (26.7% vs. 1.7%), while 70% or more of the deaths from 8 of these causes were concentrated in older (age ≥ 65 years) adults (table 2). important patterns of change in aamr in the past decade have been previously noted, from stalling of the decline in mortality due to heart disease 1 to decrease in life expectancy attributed to drug overdoses and suicides among young and middle-aged adults. 4 while interventions to prevent and treat coronary heart disease (chd) have been successful, with an age-adjusted mortality rate decrease of 14.9% in the last decade, the worrisome plateau in the decline in heart disease mortality seems to be driven by an increase in mortality for heart failure (20.7%), with the majority of deaths due to heart disease happening in the increasing aging population. 5 the largest percent decline during the time period of the study was noted for cancers. according to a recent report, this progress is driven by long-term declines in death rates for the 4 leading cancers, namely lung, colorectal, breast, and prostate cancers.
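the point that death counts can rise while rates fall can be made concrete with a small worked example. the sketch below is illustrative only: the population growth figures (26.7% and 1.7%) come from the text, but the starting populations, the age-specific death rates, and the assumed 5% rate decline are hypothetical numbers chosen for the arithmetic, not values from cdc wonder.

```python
def deaths(population: float, rate_per_100k: float) -> float:
    """Number of deaths implied by a population and a rate per 100,000."""
    return population * rate_per_100k / 100_000


# Hypothetical starting populations and age-specific rates (assumptions).
older_pop_start, younger_pop_start = 40e6, 270e6
older_rate_start, younger_rate_start = 4000.0, 200.0

# Growth of each age group over the period, as reported in the text.
older_pop_end = older_pop_start * 1.267
younger_pop_end = younger_pop_start * 1.017

# Assume both age-specific rates fall by 5% (hypothetical).
older_rate_end = older_rate_start * 0.95
younger_rate_end = younger_rate_start * 0.95

total_start = deaths(older_pop_start, older_rate_start) + deaths(younger_pop_start, younger_rate_start)
total_end = deaths(older_pop_end, older_rate_end) + deaths(younger_pop_end, younger_rate_end)

if __name__ == "__main__":
    print(f"deaths at start: {total_start:,.0f}")
    print(f"deaths at end:   {total_end:,.0f}")
```

under these assumptions every age-specific rate is lower at the end of the period, yet the total number of deaths is higher, because the group with the highest death rate grew fastest.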
that report also noted that over 2008-2017, reductions slowed for female breast and colorectal cancers and stopped for prostate cancer; in contrast, declines accelerated for lung cancer, which remains the biggest contributor of mortality among cancers. 6 it remains to be seen what the final death toll will be due to covid-19 in 2020. with more than 90,000 deaths by may, it has already surpassed the number of deaths attributed to all but 6 of the leading causes of death in 2018, including influenza and pneumonia, the 8th highest cause of mortality in 2018. due to the direct and myriad indirect consequences of this pandemic, mortality rankings due to top 10 causes noted in the current report may look very different in 2020. as noted, almost three-quarters of the deaths from 8 of these causes were concentrated in older (age ≥ 65 years) adults. further, the ≥ 65 years population is projected to increase by 39% from 52.4 million in 2018 to 73.1 million in 2030 3 so that the number of deaths from most of the 10 leading causes can be expected to increase unless more effective preventive and therapeutic interventions can be implemented.
recent trends in cardiovascular mortality in the united states and public health goals underlying cause of death projected age and sex distribution of the population life expectancy and mortality rates in the united states association between aging of the us population and heart disease mortality from cancer statistics, 2020 publisher's note: springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations key: cord-180835-sgu7ayvw authors: kolic, blas; dyer, joel title: data-driven modeling of public risk perception and emotion on twitter during the covid-19 pandemic date: 2020-08-03 journal: nan doi: nan sha: doc_id: 180835 cord_uid: sgu7ayvw successful navigation of the covid-19 pandemic is predicated on public cooperation with safety measures and appropriate perception of risk, in which emotion and attention play important roles. signatures of public emotion and attention are present in social media data, thus natural language analysis of this text enables near-to-real-time monitoring of indicators of public risk perception. we compare key epidemiological indicators of the progression of the pandemic with indicators of the public perception of the pandemic constructed from approx. 20 million unique covid-19-related tweets from 12 countries posted between 10th march and 14th june 2020. we find evidence of psychophysical numbing: twitter users increasingly fixate on mortality, but in a decreasingly emotional and increasingly analytic tone. we find that the national attention on covid-19 mortality is modelled accurately as a logarithmic or power law function of national daily covid-19 deaths rates, implying generalisations of the weber-fechner and power law models of sensory perception to the collective. 
our parameter estimates for these models are consistent with estimates from psychological experiments, and indicate that users in this dataset exhibit differential sensitivity by country to the national covid-19 death rates. our work illustrates the potential utility of social media for monitoring public risk perception and guiding public communication during crisis scenarios. the covid-19 pandemic has brought about widespread disruption to human life. in many countries, public gatherings have been broadly forbidden, mass restrictions on human movement have been introduced, and entire industries have been paralysed in attempting to lower the peak stress on healthcare systems [1] . however, the degree to which these restrictions have been enforced by law has varied over time and by location, and their success in mitigating public health risks depends on the extent of cooperation on the part of the public. a key determinant of the public's behaviour and their cooperation with state-imposed social restrictions is the public's emotional response to, and their perception of the risk presented by, the pandemic. however, the evolution of emotions and risk perception in response to disasters is not well-understood, and there is a need for more longitudinal data on such responses with which this understanding can be improved [2] . our goal is thus to contribute to bettering this understanding, and we do so by exploring the empirical relationships present between the progression of the covid-19 pandemic and the public's perception of the risk posed by the pandemic. we explain our findings in terms of the existing body of literature surrounding public perception of risk, disasters, and human suffering in cognitive psychology. in particular, we draw from psychophysics, the field that studies the relationship between stimulus and subjective sensation and perception [3] .
the search for psychophysical "laws" of perception has existed since at least the mid-19th century with the proposing of the weber-fechner law [4] , which posits that the smallest perceptible change $ds$ in a physical stimulus of magnitude $s$ is proportional to $s$. thus, the perceived magnitude $p$ of such stimuli follows $dp \propto \frac{ds}{s}$. (1) in the continuum limit, this implies that $p$ grows logarithmically with the physical magnitude $s$ of the stimulus. more recently, empirical studies by s. s. stevens [5] supported, instead, a power law relationship between human perception of a stimulus and the physical magnitude of the stimulus: $p \propto s^{\beta}$. summers et al. [6] extended this concept to human sensitivity to war death statistics and found that a power law with exponent $\beta = 0.32$ best fit the data. a number of further studies have corroborated the extension of these psychophysical laws describing the subjective perception of physical magnitudes to the subjective evaluations of human fatalities [7, 8, 9] . in all of these, perception is a concave function of the stimulus, meaning that the larger the stimulus magnitude, the more it has to change in absolute terms to be equally noticeable. thus, perception is considered relative rather than absolute, implying that our judgments are comparative in nature. this observation has been shown to account for deviations from rationality in economic decision-making [10] . these proposed psychophysical laws of human perception present an opportunity for monitoring a population's response to a disaster scenario such as the covid-19 pandemic. by evaluating the goodness of fit of these models to data on the perception of the progression of the pandemic, and determining the parameter values of such fits, we can describe the sensitivity of populations to the state of such crises, with important implications for risk communication and disaster management.
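the two candidate laws can be compared numerically with a short sketch. the functional forms follow the text (a logarithm for weber-fechner, a power law for stevens), and the exponent 0.32 is the value reported by summers et al.; the scale constant `k`, the reference stimulus `s0`, and the function names are assumptions introduced here for illustration.

```python
import math


def weber_fechner(s: float, k: float = 1.0, s0: float = 1.0) -> float:
    """Perceived magnitude under a logarithmic law: p = k * ln(s / s0)."""
    return k * math.log(s / s0)


def stevens_power_law(s: float, k: float = 1.0, beta: float = 0.32) -> float:
    """Perceived magnitude under a power law: p = k * s ** beta."""
    return k * s ** beta


if __name__ == "__main__":
    # Under the logarithmic law, doubling the stimulus always adds the same amount.
    print(weber_fechner(20) - weber_fechner(10))
    print(weber_fechner(200) - weber_fechner(100))

    # Under the power law, doubling the stimulus always multiplies by 2 ** beta.
    print(stevens_power_law(20) / stevens_power_law(10))
    print(stevens_power_law(200) / stevens_power_law(100))

    # Both are concave: the same absolute change matters less at larger magnitudes.
    print(stevens_power_law(20) - stevens_power_law(10))
    print(stevens_power_law(110) - stevens_power_law(100))
```

in both cases an increase of 10 in the stimulus registers more strongly at a magnitude of 10 than at a magnitude of 100, which is the concavity the passage describes.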
to this end, we make use of a massive twitter dataset consisting of user-posted textual data to study the public's emotional and perceptual responses to the current public health crisis. twitter provides convenient access to the conversation amongst members of the public across the globe on a plethora of topics, and many authors are studying several aspects of the public's response to the pandemic with it. twitter is a particularly appropriate tool under conditions of physical distancing requirements and furlough schemes, where online communication has become more than ever a central feature of everyday life. moreover, results from psycholinguistics and advances in natural language processing techniques enable the extraction of psychologically meaningful attributes from textual data. with this dataset, our general approach is to offer a quantitative, spatiotemporal comparison between indicators of the state of the pandemic and the topics and psychologically meaningful linguistic features present in the discussion surrounding covid-19 on social media on a country-by-country basis, for a selection of countries. our work is novel in that, to our knowledge, it is the first to use a large social media dataset spanning multiple countries to model the perceptual response of countries' citizens to the pandemic in the context of risk perception. to date, empirical validation of the aforementioned psychophysical laws has largely taken place in controlled laboratory settings, in which decisions, actions, and scenarios are artificial or hypothetical. our work thus contributes to the body of literature surrounding risk perception by investigating these laws in a naturalistic setting. however, there have been numerous authors using social media to analyse the public response to the covid-19 pandemic. this includes work that has focused on the psychological burden of the social restrictions. for instance, stella et al. 
[11] use the circumplex model of affect [12] and the nrc lexicon [13] to give a descriptive analysis of the public mood in italy from a twitter dataset collected during the week following the introduction of lockdown measures. in addition, venigalla et al [14] has developed a web portal for categorising tweets by emotion in order to track mood in india on a daily basis. others have instead focused on negative emotions, as in the work of schild et al. [15] , where they study the rise of hate speech and sinophobia as a result of the outbreaks. more specifically on perception, dryhust et al. [16] measured the perceived risk of the covid-19 pandemic by conducting surveys at a global scale (n ∼ 6000) and compared countries, finding that factors such as individualistic and pro-social values and trust in government and science were significant predictors of risk perception. de bruin and bennett [17] perform similar work in the united states. the closest work we have been able to find to our own is that of barrios and hochberg [18] , in which the authors combine internet search data with daily travel data to show that regions in the united states with a greater proportion of trump voters exhibit behaviours that are consistent with a lower perceived risk during the covid-19 pandemic. despite the above, we have been unable to find work that combines large-scale social media data with linguistic analysis to offer a spatiotemporal, quantitative analysis of emotion and risk perception during the covid-19 pandemic across multiple countries. beyond the covid-19 pandemic, our work is related to a small but growing body of literature on the use of data science in understanding human emotion and risk perception. 
in such work, natural language analysis has succeeded in supporting established linguistic theories such as the importance of the distribution of words in a vocabulary as a proxy for knowledge [19] , and regarding the relation between the uncertainty of events and the emotional response to their outcome [20, 21] . for instance, using textual data from twitter, bhatia found that unexpected events elicit higher affective responses than those which are expected [22] . in another instance, the same author conducted experiments with 300 participants and predicted the perceived risk of several risk sources using a vector-space representation of natural language, concluding that the word distribution of language successfully captures human perception of risk [23] . similar work has been conducted by jaidka et al. [24] in the area of monitoring public well-being, in which they compare word-based and data-driven methods for predicting ground-truth survey results for subjective well-being of us citizens on a county-level basis using a 1.5 billion tweet dataset constructed from 2009 to 2015. the remainder of this paper is laid out as follows. in section 2, we present the data set used in the subsequent analysis. in section 3, we provide further details on the approach followed to explore the relationships between indicators of the state of the pandemic and the public's perception of the pandemic, and discuss possible explanations for our observations by drawing on psychological literature. in section 4, we summarise and offer concluding remarks, along with a discussion of the limitations of the current work and suggestions for avenues of future work. in the following analysis, we make use of the set of tweets gathered by j. banda et al. [25] , which are obtained and maintained using the twitter free stream api 1 . at the time of writing, this data set consists of ∼ 80 million original tweets spanning from march 11, 2020 to june 14, 2020.
data is collected according to the following query filters 2 : "covid19", "coronavirus-pandemic", "covid-19", "2019ncov", "coronaoutbreak", "coronavirus" , "wuhanvirus", "covid19", "coronaviruspandemic", "covid-19", "2019ncov", "coronaoutbreak", "wuhanvirus". for our analysis, we consider only the english and spanish tweets with a non-empty self-reported location field. we process every self-reported location using openstreetmaps [26] and remove nonsensical locations (e.g. "mars", "everywhere", "planet earth"). this allows us to group the remaining tweets by country and proceed with our analysis on a country-by-country basis. to assure the statistical significance of our analysis, we keep the countries with the highest number of tweets for each language, resulting in a geolocated twitter dataset of ∼ 20 million tweets posted by ∼ 4 million users in 12 different countries, which we summarise in table 1 .
1 the free stream api randomly samples around 1% of the total tweets for the given queries.
2 a number of publicly available twitter datasets have emerged in relation to the pandemic. we chose to work with this dataset since it used the most generic query terms among all the publicly available datasets we considered, and we wanted the least amount of bias possible for our analysis.
we measure the progression of the pandemic with the number of covid-19 confirmed cases and deaths for all the countries in our analysis. the data was made publicly available by the our world in data repository [27] . in particular, we take the daily covid-19 cases and deaths, both in linear and logarithmic scale, since these are four epidemiological indicators that are most frequently used to summarise the state of the pandemic, and are therefore frequently encountered by the public. in this section, we study the public's perception of the pandemic on a country-by-country basis, using the countries with the highest number of tweets in the observation period (see table 1 ).
we do this on a country-by-country basis since the pandemic has often invoked nation-level responses, making nation-level analysis the most natural geographic scale. our broad approach is to inspect the linguistic features of the tweets released by users in the twitter dataset described in section 2.1 and compare them with the epidemiological data described in section 2.2.

our goal is to explore the public's perception of the pandemic. to do this, we analyse the linguistic features present in the textual data generated by twitter users, and map these features to psychologically meaningful categories that are indicative of the twitter users' perception. here, we are assuming that the words used by these twitter users are indicative of their internal cognitive and emotional states [28], an assumption supported by [23], which predicts the perception of risk from text data. thus, we quantify the linguistic content of each tweet using the linguistic inquiry and word count (liwc) program [29]. liwc has been widely adopted in text data analyses, and it has proven successful in applications ranging from measuring the perception of emotions [30] to predicting the german federal elections using twitter [31].

liwc operates as a text analysis program that reports the number of words in a document belonging to a set of predefined linguistically and psychologically meaningful categories [28]. for our purposes, a document is a tweet $d_i^t$ posted on date $t$ by a user based in country $i$. liwc represents documents as an unordered set of words, and a liwc category $l$ is similarly a set of words associated with concept $l$. for a given document $d_i^t$, the linguistic score $p^l$ for category $l$ is the percentage of words in $d_i^t$ that belong to $l$:

$$p^l(d_i^t) = 100 \times \frac{|d_i^t \cap l|}{|d_i^t|}. \qquad (3)$$

there are many such categories $l$, including family, work, and motion. we capitalise such category titles, and use the titles to refer to either the set of words associated with that category or to refer to the category itself.
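as an illustration of the linguistic score, the following is a minimal python sketch rather than the liwc program itself. the tokenizer, the helper names (`tokenize`, `matches`, `linguistic_score`) and the toy `DEATH` word list are all assumptions made for this example; the real liwc dictionaries are proprietary and much larger. the sketch also counts repeated tokens, whereas the text describes documents as unordered sets of words.

```python
import re

def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def matches(word, entry):
    """True if `word` matches a dictionary entry.

    A trailing '*' is treated as a prefix wildcard (e.g. 'isolat*').
    """
    if entry.endswith("*"):
        return word.startswith(entry[:-1])
    return word == entry

def linguistic_score(text, category):
    """Percentage of tokens in `text` that match any entry of `category`."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(any(matches(tok, entry) for entry in category) for tok in tokens)
    return 100.0 * hits / len(tokens)

# Hypothetical toy category; NOT the real LIWC death word list.
DEATH = {"death*", "die", "died", "fatal*"}

score = linguistic_score("Three people died and deaths keep rising", DEATH)
print(round(score, 2))
```

in this toy sentence two of the seven tokens ("died" and "deaths") match the category, so the score is 2/7 expressed as a percentage.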
linguistic scores from eq. (3) for individual tweets will be noisy, as tweets are short documents. moreover, we are interested in the average response of the population of a country. for this reason, we group the tweets by country $i$ and by date $t$, and denote these sets of tweets as $D_i^t$. we then compute the national linguistic score (nls) for category $l$ as the average of the linguistic scores over documents in $D_i^t$, relative to an empirically observed twitter base rate $p^l_b$ (eq. (4)). the base rates $p^l_b$ for the use of words on twitter associated with category $l$ are given in [29]. using eq. (4) for all the selected linguistic categories, we construct multidimensional country-level time series that represent the evolution of the public perception of the pandemic, similar to the linguistic profiles introduced by tumasjan et al. [31].

in figure 1, we show the collection of nlss for a selection of relevant linguistic categories. we observe clear trends that, in most cases, are synchronized between countries and languages. in particular, most categories associated with emotion (notably affect, anger, anxiety, positive emotion, negative emotion, and swear words; swearing is associated with frustration and anger [32]) have their highest scores in mid-to-late march, when the world health organisation (who) announced the pandemic status of covid-19 and most western countries introduced more stringent social restrictions [1]. these scores decay thereafter, indicating a relaxation of the emotional response in the conversation. this is consistent with results reported by bhatia regarding the affective response to unexpected events [22]. a qualitatively similar trend can be seen in the social processes panel, the category involving "all non-first-person-singular personal pronouns as well as verbs that suggest human interaction (talking, sharing)" [29].
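the aggregation from tweet-level scores to a national linguistic score can be sketched as follows. this is an illustration under a stated assumption: "relative to the base rate" is read here as the relative change of the daily mean score with respect to the base rate, which may differ from the exact normalisation used in eq. (4). the function name, the record layout and the numbers are hypothetical.

```python
from collections import defaultdict

def national_linguistic_scores(records, base_rate):
    """Aggregate tweet-level scores into one value per (country, date).

    records   : iterable of (country, date, score), with score in percent
    base_rate : base rate for the category, in percent
    Returns {(country, date): (mean score - base_rate) / base_rate}.
    ASSUMPTION: relative change is only one reading of "relative to".
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for country, date, score in records:
        sums[(country, date)] += score
        counts[(country, date)] += 1
    return {key: (sums[key] / counts[key] - base_rate) / base_rate
            for key in sums}

# Hypothetical tweet-level scores for one category.
records = [
    ("GB", "2020-03-11", 2.0),
    ("GB", "2020-03-11", 4.0),
    ("US", "2020-03-11", 1.0),
]
nls = national_linguistic_scores(records, base_rate=2.0)
print(nls)
```

averaging before normalising keeps short, noisy tweets from dominating the daily value, which is the motivation given in the text for working at the national level.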
we also observe that health-related categories such as death and health show an overall rising trend, with death rising most rapidly throughout march. these categories, with the exception of positive emotion and health, peak again in the united states at the end of may, coinciding with the murder of george floyd and the subsequent black lives matter protests. such universal trends are not apparent by visual inspection in the money, risk, and sadness panels. an additional feature of these plots is the absolute scale of these values: in all cases, there is a significant percentage change from their baseline values, with large percentage increases observed initially in the use of words associated with anxiety and later with death, and a moderate percentage increase in the use of words associated with risk.

in this section, we explore the relationship between the nlss described in section 3.1, which we use as a proxy for the public's perception, and the intensity of the pandemic, which we assume is the stimulus triggering this perception. our measure of the intensity of the pandemic is the number of covid-19 cases and deaths from the data described in section 2.2. a straightforward way of approaching this relationship is to compute the correlations between the nlss and the epidemiological data on a per-country basis, and we show the average across countries of these per-country correlations in figure 2. on the one hand, we observe significant negative correlations in emotionally charged categories (e.g. swear words, anger, anxiety, affective processes), indicating a decay in emotion as the pandemic intensifies. conversely, categories related to health and mortality (death, health) and analytical thinking (analytic) show significant positive correlation 4 . we believe the trends we observe in fig. 1 and the correlations we observe in fig. 2 are consistent with the notion of psychophysical numbing.
this term was introduced by robert jay lifton [33] , and developed by paul slovic [7, 8] in the context of human perception of genocides and their associated death tolls, to describe the paradoxical phenomenon in which people exhibit growing indifference towards human suffering as the number of humans suffering increases. by inspecting the correlations between the nlss and the epidemiological indicators, we find that as the pandemic intensifies -in the sense of an increasing number of cases and deaths reported daily -our emotional response diminishes, as expected from a psychophysical numbing phenomenon. specifically, we observe negative correlations between almost all components of the nlss associated with affect -affective processes, anger, anxiety, negative emotion, positive emotion, and swear words -and the epidemiological data 5 . by inspecting figure 1 , we see that every country exhibits similar downward trends in these components and, with the exception of anxiety, are all significantly lower than their baseline values throughout the observation period. this unusually low and decreasing affect word count is accompanied, conversely, with a growing awareness of the morbidity of the situation in that we observe significant positive correlations between the death nlss and the daily national cases and deaths, indicating that the decrease in affect occurs simultaneously with and despite an attentional shift towards covid-19 related mortality. we also observe a simultaneous increase in the analytic component of each english-language dataset 6 over this same period, indicating a movement towards more logical and analytical, rather than intuitive and emotional, thinking. the potential implication of this is that the public is less perceptive of the risk that the pandemic poses to public health, since their emotional response is reduced and reducing [34] . 
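the per-country correlations between an nls series and an epidemiological indicator, in linear and logarithmic scale, can be sketched as below. this is a minimal illustration assuming pearson correlation; the function names and the two toy countries "A" and "B" are hypothetical, and days with zero counts are simply dropped before taking logarithms.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def linear_and_log_correlation(nls, counts):
    """Correlate an NLS series with daily counts in linear and log scale."""
    linear = pearson(nls, counts)
    positive = [(p, math.log(c)) for p, c in zip(nls, counts) if c > 0]
    log_scale = pearson([p for p, _ in positive], [c for _, c in positive])
    return linear, log_scale

# Hypothetical series: (NLS, daily deaths) for two toy countries.
series = {
    "A": ([1.0, 2.0, 3.0, 4.0], [10, 100, 1000, 10000]),
    "B": ([4.0, 3.0, 2.0, 1.0], [10, 100, 1000, 10000]),
}
results = {c: linear_and_log_correlation(p, s) for c, (p, s) in series.items()}

# Cross-country average of the per-country log-scale correlations.
mean_log = sum(r[1] for r in results.values()) / len(results)
print(results, mean_log)
```

in the toy data the nls of country "A" grows linearly while deaths grow tenfold each day, so the log-scale correlation is essentially 1 while the linear-scale correlation is visibly lower.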
4 when analysing these correlations, we found that, overall, the cumulative cases and deaths correlate better with most linguistic categories than the daily data. however, while this is sensible in the early stages of the pandemic, it is unlikely to remain the case over a long time horizon due to humans' finite memory. we therefore proceeded with our comparison using the daily epidemiological data alone.

5 the only exception is the cross-country average of the sadness component of the nlss, which is positively correlated with the epidemiological indicators and appears to be driven only by argentina's, chile's, and colombia's increasing use of words related to sadness. the remaining countries remain stationary at a lower-than-baseline value for this component.

6 unfortunately, the spanish liwc dictionary does not yet have an analytic category.

for example, van bavel et al. [35] and loewenstein et al. [36] describe how risk perception is driven more by association and affect-based processes than by analytic and reason-based processes, with the affect-based processes typically prevailing when there is disagreement between the two modes of thinking. the negative correlations between the intensity of the pandemic and affective processes, together with its positive correlation with the prevalence of analytic processes, suggest that public risk communication could be adjusted to re-balance the degree of affective and analytic thinking amongst members of the public to achieve favourable risk avoidance behaviour and, consequently, favourable public health outcomes.

to support our claim that these observations are attributable to psychophysical numbing, we construct word co-occurrence networks using tweets in our dataset. given a set $T$ of tweets, the word co-occurrence network $g(T)$ is represented by a weighted adjacency matrix $a(T)$ in which the nodes are words belonging to the death and affect liwc dictionaries.
entry $a_{ij}(T)$ counts the number of co-occurrences between words $i$ and $j$ across all tweets in $T$, and is computed as

$$a_{ij}(T) = \sum_{t \in T} b_{ti}(T)\, b_{tj}(T),$$

where $b_{tk}(T)$ counts the number of instances of word $k$ in tweet $t \in T$. we ignore self-edges by imposing $a_{ii} = 0$, since it is the relationship between distinct words that is of interest. (see appendix b.1 for further details on the construction of these networks.) if the psychophysical numbing effect is legitimate, we expect that words in the death dictionary co-occur more frequently with other death-related words and less frequently with words in the affect dictionary.

we construct three such networks by aggregating the word co-occurrences over three distinct periods: 11th march to 9th april 2020, 10th april to 23rd may, and 24th may to 13th june. as we discussed previously, the first period coincides with the pandemic status of covid-19 declared by the who and has a high affect score but a low and increasing death score; the second one has a high and relatively stable death score and a decreasing affect score; and the third has a high death score but one in which the affect scores and some of its subcategories (e.g. anger, anxiety, negative emotion) increase again, which we attribute, at least partly, to the public response to the murder of george floyd and the subsequent black lives matter protests. in constructing these networks, we weight each country equally by taking a random sample of approximately 300,000 tweets from each country.

11th march - 9th april

in this network (see fig. 3a) we see two main clusters emerging. the first consists mostly of words associated with death (left), and the second of words associated with affect (right). the appearance of some of the affect-related words in the death cluster can be explained given the context of the pandemic.
for example, the word "positive" is likely used in reference to the number of people that have tested positive for covid-19, which is closely related to the conversation around covid-19 cases and deaths. similarly, the word "panic" is likely reflecting the early conversations around panic-buying of household goods, for example toilet paper and hand-sanitiser, and the word "isolat*" is likely used in calls for symptomatic individuals to self-isolate. thus, while some instances of affect-related words that appear in this predominantly death-based community are harder to explain without appealing to the existence of a true subjective experience of affect amongst the twitter users (e.g. "risk*"), the most important (in terms of node degree) of these affect-related words are more likely being used here in an affect-free sense and are appropriately grouped with death-related words here given the context of the pandemic. thus the community structure we observe is consistent with our hypothesis of a separation between words belonging to the death and affect dictionaries. 10th april -23rd may in this network (see fig. 3b ), the two-cluster structure seen in the previous snapshot remains, with the cluster more centered on death on the left and a cluster corresponding to almost exclusive use of affect-related words on the right. the size of the death-related cluster has increased relative to the affect-based community, reflecting the higher death nls during this period. two new and important affect-related nodes appear in the death-based community for this time period: "care" and "fail*". these can once again be plausibly explained by the context of the pandemic. 
for example, the appearance of the word "care" in the death-related community can be explained in terms of the conversation surrounding the health care system and death care industries, the number of covid-19 patients being admitted to intensive care units, and, particularly for the united kingdom, the number of deaths that have occurred in care homes for the elderly. these are all clearly related to covid-19 deaths, and the word "care" in this context most likely constitutes part of the noun and topic of conversation rather than any expression of emotion. the word "fail*" could reflect the discussion around failures on the part of governments to respond with sufficient vigor to the public health crisis (e.g. a failure to impose social restrictions in a timely manner, or to meet testing quotas or quotas on the provision of personal protective equipment for key workers). for example, the polling company yougov finds that approximately 50% of respondents felt during this period that the us and uk governments had been handling the pandemic well, and that these numbers decreased throughout this period to approximately 45% [37]. this does not, however, exclude the possibility that the appearance of "fail*" indicates a subjective emotional experience: it is possible that twitter users who fixate on government failures are doing so as a result of a sense of outrage with regard to these perceived failures. whether such outrage is motivated specifically by the human fatalities themselves or is merely a manifestation of broader political hostilities and polarisation in modern society remains open.
thus, while the appearance of "sure*", "fail*", and some other minor affect-related terms in the death community may be truly indicative of emotion in the conversation around covid-19 fatalities, the presence of many of the most highly co-occurring affect-related words in this predominantly death-related community could be explained by their appearance in common phrases related to covid-19 fatalities, e.g. the "death care" and "health care" industries, "care homes", "testing positive" for the virus etc. these words, therefore, do not necessarily reveal emotion in the current context. we thus argue once again that this co-occurrence network and its community structure show that death- and affect-based words are well-separated, consistent with our claim of psychophysical numbing.

24th may - 14th june

our argument remains unchanged for this period (see fig. 3c). the only notable difference for this period is that a significant proportion of the conversation surrounding death is focused on the political issues that inspired the black lives matter protests and on the protests themselves. this is apparent from the appearance of the word "protests" in the death-related community on the left-hand side.

altogether, this analysis demonstrates that words indicating a subjective emotional/affective experience and words related to death are well-separated in this twitter data, which is consistent with the notion of psychophysical numbing as an explanation for the trends and correlations observed in figures 1 and 2. for completeness, we include the equivalent co-occurrence graphs for the spanish-language tweets in appendix b.2, from which similar conclusions can be drawn.

in the previous section, we demonstrated our finding that as the pandemic intensifies, the proportion of words that appear in the set of tweets posted in each country that indicate emotion diminishes over time.
this indicates that the actual emotional response to the pandemic diminishes as the intensity of the pandemic increases, implying a psychophysical numbing effect. we supported this explanation by showing that the word co-occurrence networks induced by our set of tweets host a community structure that separates words in the death and affect dictionaries, suggesting that people do not talk about covid-19 deaths in a highly emotional tone. the following sections model the relationship between the progression of the covid-19 pandemic and the twitter users' perception using grounded theories of psychophysical numbing.

our analysis suggests that the public's perception of the progression of the pandemic is logarithmic or, at least, sublinear. from figure 2, we observe that the correlations between nlss and epidemiological data are generally larger in absolute value whenever the latter are taken in logarithmic scale. to exemplify this observation, we show in figure 4 the z-scores 7 of the death nlss and of the logarithm of the daily number of deaths and cases within each country. the general correspondence between all three normalised features in each country is striking 8 .

7 recall that the z-score of a sequence of observations $y = (y_1, \ldots, y_t)$ is given by $z = (y - \mu_y)/\sigma_y$, where $\mu_y$ and $\sigma_y$ are the mean and standard deviation of $y$, respectively.

8 we note that the correspondence is weaker for australia, nigeria, and south africa due to the relatively low number of cases in these countries (see fig. 9 in the appendix for reference). the correspondence is also weaker in spain, for two reasons: its revision of the number of cases in late may, resulting in a day of "negative deaths"; and its having recorded a day with no covid-19-related deaths, which was a significant event given that spain had seen many deaths until that point.

figure 3: snapshots of the word co-occurrences associated with death (green labels) and affect (red labels) for english-language tweets aggregated across all analyzed countries in three different time windows (see sub-captions; panel (c): may 24th to june 13th, 2020). the nodes are coloured according to their community label as obtained by maximising modularity with the louvain algorithm [38]. we filtered edges with weight below 20 co-occurrences for visualisation purposes.

figure 4: ... (t) and the national daily death rate is given in parentheses for each country. data is smoothed with a 3-day moving average and standardized with their z-score to make them visually comparable. vertical lines represent peaks in the death discourse caused by exogenous events (see main text for details), which we remove from the time series.

table 2: results from the fit of the weber-fechner law to the observed relationship between the death nls and the logarithm of the daily number of deaths in each country (see figure 4). overall, this model best describes the relationship between the daily number of deaths local to each country and the death nls.

we propose that this can be explained in terms of the weber-fechner law [4], which is a quantitative statement with its origins in psychology and psychophysics regarding humans' perceived magnitude $p$ of a stimulus with physical magnitude $s$. it states that a human's perception of the magnitude of a stimulus varies as the logarithm of the physical magnitude $s$ of the stimulus, meaning we are more sensitive to ratios when comparing different physical magnitudes than we are to absolute differences. in the continuum limit, eq. (1) gives the following functional form for the weber-fechner law:

$$p(t) = k \log\frac{s(t)}{s_0} + r(t), \qquad (6)$$

where $k$ and $s_0$ are real-valued parameters and $r(t)$ is the residual. parameter $k$ determines the sensitivity of perception to changes in the stimulus $s$, while $s_0$ determines the minimum threshold that the stimulus $s$ must overcome in order to be perceived. the residual term $r(t)$ is a random variable representing noise not directly captured by the stimulus.
for instance, exogenous events can trigger abrupt peaks in the death score. this is the case, for example, with the murder of george floyd in the united states, or the peak in nigeria around april 17th 2020, triggered by a number of prominent african figures dying from covid-19 around that day, including the nigerian president's top aide (see [39]).

in order to test the weber-fechner law, we fit a linear regression model to $p^{death}_i(t)$, the death nls time series in country $i$, and $\log s_i(t)$, the logarithm of the daily number of deaths in the same country, and summarize the results of these fits in table 2. we find that eq. (6) accurately models the data, with significant coefficients (p-value < 0.01) for all countries except spain. the sensitivity parameter $k$ has the same order of magnitude for all countries with significant fits. however, the country with the lowest $k$ is ∼ 3 times less sensitive than the country with the highest, indicating that twitter users in different countries may react differently to the evolution of the pandemic. the minimum stimulus threshold $s_0$, on the other hand, is always small: in most countries, a single covid-19 death in a given day is enough to be perceived. the exceptions are the united states and the united kingdom, which need approximately 5 and 6 deaths, respectively, to be perceived; this is still small compared to the thousands of daily deaths registered in these countries during the observation period.

table 3: the results from the fit of a power law to the relationship between the death nls and the national daily death count. this is the best model in some cases, though it is outperformed by the weber-fechner law in most. *while we fit this model assuming a log-log relationship between $p$ and $s$, we compute $r^2$ with linear $p$ to make it comparable to the model implied by the weber-fechner law (see eq. (10) in appendix a for details). this may cause negative values of $r^2$, as is the case for spain.
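the weber-fechner fit described above can be sketched as an ordinary least-squares regression of the death nls on the logarithm of daily deaths. writing p = k log(s / s_0) as p = k log s + c, with c = -k log s_0, the slope gives k and the threshold follows as s_0 = exp(-c / k). this is a minimal sketch, not the fitting code used in the paper: the natural logarithm, the dropping of zero-death days and the synthetic data are assumptions of the example.

```python
import math

def ols(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

def fit_weber_fechner(deaths, nls):
    """Fit p = k * ln(s / s0); returns (k, s0).

    Days with zero deaths are skipped because ln(0) is undefined.
    """
    pairs = [(math.log(s), p) for s, p in zip(deaths, nls) if s > 0]
    k, c = ols([a for a, _ in pairs], [b for _, b in pairs])
    return k, math.exp(-c / k)

# Synthetic, noise-free data generated with k = 0.5 and s0 = 2.
deaths = [2, 4, 8, 16, 32]
nls = [0.5 * math.log(s / 2.0) for s in deaths]
k, s0 = fit_weber_fechner(deaths, nls)
print(k, s0)
```

the estimate of s_0 does not depend on the base of the logarithm, whereas k scales with it, so the base should be fixed before comparing sensitivities across countries.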
an alternative functional form for the relationship between human perception $p$ of a stimulus and the physical magnitude $s$ of the stimulus is a power law relationship

$$p(t) = \nu\, s(t)^{\beta}\, \tilde{r}(t), \qquad (7)$$

where $\nu$ and $\beta$ are parameters determining, respectively, the perception from a stimulus of unit magnitude and the growth rate of the perception as a function of the stimulus magnitude, and $\tilde{r}(t)$ is a residual term. this form has been shown to outperform the weber-fechner law in characterising human perception in a number of empirical studies [5]. we therefore also fit this model to the relationship between the death nls $p^{death}_i(t)$ and the national daily death counts $s_i(t)$ for each country $i$, reporting our results in table 3. in all cases, we observe sublinear exponents $\beta$ for the perception of the daily deaths data, with significant exponents (p-value < 0.01) ranging between 0.085 and 0.36. these exponents are of the same order of magnitude as the $\beta$ of 0.32 reported in [6], which measures psychophysical numbing in participants' perception of death statistics in several laboratory experiments. as discussed previously, the data for spain is unusual for a number of reasons, and the model does not accurately describe the data in this instance. these results suggest that twitter users in certain countries are more sensitive to changes in the number of deaths than others.

both the weber-fechner law and the power-law relationship between the death nls and the daily number of reported deaths accurately model the data. each captures the phenomenon in which "the first few fatalities in an ongoing event elicit more concern than those occurring later on" [40]. by way of comparison, we present in table 4 the normalised root mean squared error (nrmse) for these models, in addition to a linear model between $p^{death}_i(t)$ and $s_i(t)$ as a baseline "null" model. the nrmse is defined in terms of the model residual $e(t) = p(t) - \hat{p}(t)$ and the sample size $n$. the models are directly comparable in this sense, since each involves only two parameters.
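the power-law fit can be sketched in the same spirit: taking logarithms turns p = ν s^β into a straight line, log p = log ν + β log s, which is fitted by least squares, and the goodness of fit is then evaluated on the linear scale of p so that it is comparable with the weber-fechner fit. the function names and the synthetic data are assumptions of this example, and observations with non-positive values are dropped before taking logarithms.

```python
import math

def fit_power_law(deaths, nls):
    """Fit p = nu * s**beta by least squares in log-log space."""
    pairs = [(math.log(s), math.log(p))
             for s, p in zip(deaths, nls) if s > 0 and p > 0]
    xs = [a for a, _ in pairs]
    ys = [b for _, b in pairs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    beta = (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
            / sum((a - mx) ** 2 for a in xs))
    nu = math.exp(my - beta * mx)
    return nu, beta

def r_squared(observed, predicted):
    """1 - SS_res / SS_tot on the linear scale; negative for poor fits."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Synthetic, noise-free data generated with nu = 2 and beta = 0.3.
deaths = [1, 10, 100, 1000]
nls = [2.0 * s ** 0.3 for s in deaths]
nu, beta = fit_power_law(deaths, nls)
r2 = r_squared(nls, [nu * s ** beta for s in deaths])
print(nu, beta, r2)
```

because the parameters are estimated in log space but the fit is scored in linear space, nothing guarantees that the model beats the mean of p, which is why negative values can occur.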
bhatia [41] made a similar model comparison to test psychophysical laws for subjective probability judgements of real-world events, in that case finding that the linear relationship was the best. in our case, however, a linear relationship between s and p is significantly worse than the present concave models of perception (see appendix a for the results of the linear model), reinforcing our hypothesis of psychophysical numbing. while the weber-fechner law is better than the power law model overall, the difference in their goodness of fit -as measured by the nrmse -is marginal. both are reasonable descriptions of the observed relationship, and similar conclusions can be drawn from both. in particular, the parameters k and β from the weber-fechner law and power law, respectively, are analogous in their interpretation as the measure of the sensitivity of the nation's twitter users to changes in the national covid-19 daily death rate. to illustrate this, we rank the countries in our dataset in order of sensitivity to changes in the local death rate, as measured separately by these two parameters, and plot the correlation between the countries' ranks in figure 5 . here, low rank indicates high sensitivity to changes in the number of daily deaths nationally. the correlation between the two methods of ranking -according to k, the weber-fechner law slope parameters, and according to β, the power law model exponents -is high, with correlation coefficient 0.77. this shows that the sensitivity of each country is relatively robust between models. by both measures, therefore, twitter users tweeting in english and spanish from australia and argentina, respectively, appear to be the most sensitive to changes in the national daily death rate, while twitter users posting in english from south africa, india, and nigeria and in spanish from spain and chile appear to be the least sensitive to these changes. 
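the comparison of country rankings under the two models can be sketched as a rank correlation. the example below assumes that rank 1 is given to the largest parameter value (the most sensitive country), that there are no ties, and that the correlation between ranks is a pearson correlation of the two rank vectors (i.e. spearman's coefficient); the parameter values are hypothetical and are not those of tables 2 and 3.

```python
import math

def ranks_descending(values):
    """Rank 1 goes to the largest value; ties are not handled."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rank_correlation(a, b):
    """Correlation between the rankings induced by two parameter sets."""
    return pearson(ranks_descending(a), ranks_descending(b))

# Hypothetical sensitivities for four countries (same country order).
k_values = [0.9, 0.5, 0.7, 0.2]         # Weber-Fechner slopes
beta_values = [0.30, 0.25, 0.20, 0.10]  # power-law exponents
rho = rank_correlation(k_values, beta_values)
print(rho)
```

here only the second and third countries swap places between the two rankings, which gives a coefficient of 0.8.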
figure 5: comparison of the rank of each country as determined by their $k$ and $\beta$ parameters in the weber-fechner and power-law fits, respectively, which determine the sensitivity of twitter users tweeting from each country to changes in the number of daily reported deaths. low rank indicates high sensitivity relative to the remaining countries. the correlation between countries' ranks from both measures is high at 0.77.

we explored the country-by-country relationship between the linguistic features present in a large set of tweets posted in relation to the covid-19 pandemic, and the progression/intensity of the pandemic as measured by the daily number of cases and deaths in each country we consider. by considering the change, relative to a baseline, in the percentage of words present in each tweet that are associated with a number of psychologically meaningful categories (here called linguistic scores), we observed significant trends that we believe are indicative of a psychophysical numbing effect [7]. we found that the national linguistic scores (nlss, see eq. (4)) associated with emotion and affect decrease as the pandemic intensifies. this is in spite of a greater attentional focus on death and mortality and a simultaneous increase in the use of words indicating analytic reasoning. we showed, by constructing word co-occurrence networks on different time periods of the pandemic, that words related to death co-occur more frequently with other words related to death than they do with words indicating affect and emotion, and that this separation of affect from the conversation around death is also revealed by the community structure of this network. this is consistent with the notion of psychophysical numbing, which we believe explains these observations.
we also showed that the psychophysical laws of weber-fechner and of power law perception in humans accurately model the relationship between the frequency of words related to death and the actual daily number of covid-19 deaths in each country. we estimated sub-linear exponents in the power law perception function that are of similar values to values previously estimated from psychological experiments [6] . these exponents, together with parameter k of the weber-fechner law (see eq. (6)), tell us how sensitive the twitter users in each country are to their national covid-19 daily deaths, and were seen to vary by country, indicating intercountry differences in risk perception and sensitivity to death rates. such sensitivities were consistent across models (see fig. 5 ) suggesting that these measures of a nation's twitter users' sensitivities to changes in the national death rate are robust features of the data. our findings illustrate the signaling power of twitter, and demonstrate its potential use as a tool for monitoring public perception of risk during large-scale crisis scenarios. with the modelling and visualisation approaches we employ in this paper, policy-makers and public officials could track in near-to-real-time the public's attitudes towards threats to public well-being and the prevalence of factors important to public perception of risk, including degree of outrage and relative attentional focus on the threat. our findings also imply a functional form for agent perception of the system state in models of opinion dynamics. this will be instrumental for developing coupled opinion dynamics-epidemiological models, in which the bidirectional relationships between human perception, human behaviour, and epidemic progression are modelled endogenously. a natural extension to this work would involve nowcasting and/or forecasting of certain economic indicators. 
this work has also been limited in that we assumed that only the national death rate is a significant predictor of perception. a more complete analysis should account for the effect of other countries' death statistics as a driver of local perception or, more broadly, advance a process-level explanation of the cross-cultural differences we observe in the sensitivity to death statistics. this analysis could also be enhanced by relating these measures of risk perception to behavioural data, which, since "people's behavior is mediated by their perceptions of risk" [10], may be useful for understanding the role of emotions in driving behaviours that are conducive to public health during crises. further, a deconstruction of the aggregate indicators we have developed to the state and regional level may be necessary to more accurately characterise the relationship between local crisis progression and human risk perception. we also stress that the results presented in this paper may be indicative only of the responses of twitter users posting from each of these countries in each of these languages, so extrapolating these results to the broader population will only be possible with a better understanding of the biases present in, and representativeness of, the dataset at hand.

bk acknowledges funding from the conacyt-sener: sustentabilidad energética and jd acknowledges funding from the epsrc industrially focused mathematical modelling centre for doctoral training.

the authors declare that they have no competing interests.

the twitter data used in the manuscript is collected and maintained by banda et al. at the panacea lab [25], and it is available at their website http://www.panacealab.org/covid19. the data on covid-19 confirmed cases and deaths were obtained from the "coronavirus pandemic (covid-19)" page of the our world in data website [27], and the stable url for this data is https://covid.ourworldindata.org/data/owid-covid-data.csv.
bk and jd both conceived the idea, carried out the analysis, and wrote, read, and approved the final manuscript.
• liwc: linguistic inquiry and word count. • who: world health organization. • nls: national linguistic score.
in this section, we present further results of our models to give a more complete overview of their quality. besides the weber-fechner law and power law models (see eqs. (6) and (7)), we use a linear relationship between $s$ and $p$ as our benchmark model, where $a$ and $b$ are its parameters. we summarize our results for the linear model in table 5. for all models, we compute the $r^2$ values, where $e(t) = p(t) - \hat{p}(t)$ is the model residual, $\sigma_p^2 = \sum_{t=1}^{n} (p(t) - \mu_p)^2 / (n - 1)$ is the variance of $p(t)$, and $n$ is the sample size. the $r^2$ values for all models are summarized in table 6. (note that as the power law model implies a log-normal residual, the $r^2$ values can be negative.) from this table we see that, once again, the weber-fechner law is generally a better fit to the data across all countries, but that the power law and weber-fechner models are often comparable and significantly better than the linear model. we also show in figures 6 and 7 scatterplots of the death nlss against the logarithm of the daily number of deaths in each country, with the y-axis in linear and log scales, respectively. red lines indicate the line of best fit, with the slope equal to $k$ and $\beta$ in eqs. (6) and (7), respectively.
b word co-occurrence analysis
in constructing the word co-occurrence networks presented in section 3.2.1, we perform basic text preprocessing, including taking the lower-case form of all letters, removing urls, removing punctuation, and removing the following small set of stopwords from the vocabulary: to, today, too, has, have, like. we retain hashtags, since liwc also recognises hashtags and because hashtags are an essential aspect of communication on twitter.
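the preprocessing steps listed above can be sketched as follows. this is a minimal illustration rather than the authors' pipeline: the function name `preprocess`, the url pattern, and the use of python's ascii `string.punctuation` table are assumptions made for the example; only the stopword list and the decision to keep hashtags come from the text.

```python
import re
import string

# Stopword list quoted from the text.
STOPWORDS = {"to", "today", "too", "has", "have", "like"}

# Assumption: URLs are anything starting with http(s):// or www.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+")

# Assumption: "punctuation" means ASCII punctuation. The '#' character is
# excluded from removal so that hashtags survive as tokens.
PUNCT_TABLE = str.maketrans("", "", string.punctuation.replace("#", ""))


def preprocess(tweet: str) -> list[str]:
    """Lower-case a tweet, strip URLs and punctuation, drop stopwords."""
    text = tweet.lower()
    text = URL_PATTERN.sub(" ", text)   # remove URLs before punctuation
    text = text.translate(PUNCT_TABLE)  # remove punctuation, keep '#'
    return [tok for tok in text.split() if tok not in STOPWORDS]


tokens = preprocess("Stay SAFE today! #COVID19 https://t.co/abc has spread, like...")
print(tokens)
```

urls are removed before punctuation so that stripping ":" and "/" does not leave url fragments behind as tokens.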
it is also necessary to account for the fact that a number of "words" appearing in the liwc dictionary are in fact regular expressions to which many complete words in the twitter dataset map. for example, the "word" "isolat*" appears in the english liwc dictionary, to which each of the following words would map: "isolate", "isolated", "isolating". thus, construction of the word co-occurrence networks $G_i$ involves a two-step procedure: first, constructing the raw word co-occurrence networks $G_i^{\mathrm{raw}}$, in which the nodes are words exactly as they appear in the twitter dataset; and then reducing each of these to a quotient graph $G_i$ by contracting nodes in $G_i^{\mathrm{raw}}$ that are matched by the same regular expression in the liwc dictionary. more formally: the liwc dictionary implies an equivalence relation $\sim$ on the vocabulary $\mathcal{V}$ of the twitter dataset, such that $v \sim u$ for words $v, u \in \mathcal{V}$ if both $v$ and $u$ are matched by the same regular expression in the liwc dictionary. the weights of edges between nodes $V \subset \mathcal{V}$ and $U \subset \mathcal{V}$ in $G_i$ are then defined in terms of $w_G(x, y)$, the weight of edge $(x, y)$ in a graph $G$. note that $w_G(x, y) = w_G(y, x)$ and $w_G(x, y) = 0$ if $(x, y)$ is not an edge in $G$.
for completeness, we provide here the word co-occurrence graphs for the spanish language tweets. we omit a discussion of the results, since similar conclusions can be drawn from these as from their english counterparts.
we include this section as a reference for the actual number of deaths in each country for the period we analysed throughout the paper, which we present in fig. 9.
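the two-step quotient-graph construction described earlier in this appendix can be sketched as follows. this is an illustration under stated assumptions, not the authors' code: the edge-weight formula is not legible in the text, so the sketch assumes contracted edge weights are the sum of the raw weights between the two word classes; it also assumes that liwc wildcards are trailing `*` prefixes only, that a word takes the first dictionary entry that matches it, and that edges falling inside a single class are dropped. the function names are hypothetical.

```python
from collections import defaultdict


def liwc_class(word, liwc_entries):
    """Return the first LIWC entry matching `word`, or the word itself."""
    for entry in liwc_entries:
        if entry.endswith("*"):
            # Assumption: a trailing '*' means "any word with this prefix".
            if word.startswith(entry[:-1]):
                return entry
        elif word == entry:
            return entry
    return word  # words outside the dictionary stay as singleton classes


def quotient_graph(raw_edges, liwc_entries):
    """Contract nodes of an undirected weighted graph by LIWC class.

    raw_edges maps (x, y) word pairs to co-occurrence weights, with each
    undirected pair stored once.
    """
    contracted = defaultdict(int)
    for (x, y), weight in raw_edges.items():
        cx = liwc_class(x, liwc_entries)
        cy = liwc_class(y, liwc_entries)
        if cx == cy:
            continue  # assumption: self-loops created by contraction are dropped
        # Assumption: contracted weight is the sum over all raw edges
        # running between the two classes.
        contracted[tuple(sorted((cx, cy)))] += weight
    return dict(contracted)


raw = {
    ("isolate", "death"): 3,
    ("isolated", "death"): 2,
    ("isolating", "isolate"): 5,
    ("death", "home"): 1,
}
print(quotient_graph(raw, ["isolat*", "death"]))
```

storing each contracted edge under a sorted pair of class labels keeps the graph undirected, in line with the symmetry of the edge weights noted in the text.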
covid-19 government response tracker, blavatnik school of government risk perception and behaviors: anticipating and responding to crises psychophysical numbing: an empirical basis for perceptions of collective violence if i look at the mass i will never act": psychic numbing and genocide insensitivity to the value of human life: a study of psychophysical numbing psychophysical numbing: when lives are valued less as the lives at risk increase perception matters: psychophysics for economists #lockdown: network-enhanced emotional profiling at the times of covid-19 the circumplex model of affect:an integrative approach to affective neuroscience, cognitive development, and psychopathology emotions evoked by common words and phrases: using mechanical turk to create an emotion lexicon mood of india during covid-19 -an interactive web portal based on emotion analysis of twitter data go eat a bat risk perceptions of covid-19 around the world relationships between initial covid-19 risk perceptions and protective health behaviors: a national survey risk perception through the lens of politics in the time of the covid-19 pandemic distributional structure the effect of differential failure on expectation of success, reported anxiety, and response uncertainty discrepancy from expectation in relation to affect and motivation: tests of mcclelland's hypothesis affective responses to uncertain real-world outcomes: sentiment change on twitter predicting risk perception: new insights from data science estimating geographic subjective well-being from twitter: a comparison of dictionary and data-driven language methods a largescale covid-19 twitter chatter dataset for open scientific research -an international collaboration openstreetmap contributors coronavirus pandemic (covid-19) the psychological meaning of words: liwc and computerized text analysis methods the development and psychometric properties of liwc2015 anxious or angry? 
effects of discrete emotions on the perceived helpfulness of online reviews predicting elections with twitter: what 140 characters reveal about political sentiment the pragmatics of swearing beyond psychic numbing: a call to awareness responding to community outrage: strategies for effective risk communication using social and behavioural science to support covid-19 pandemic response risk as feelings covid-19: government handling and confidence in health authorities fast unfolding of communities in large networks africa's top virus deaths the cognitive psychology of sensitivity to human fatalities: implications for life-saving policies vector space semantic models predict subjective probability judgments for real-world events snapshots of the word co-occurrences associated with death ("muerte", green labels) and affect ("afecto", red labels) for spanish-language tweets aggregated across all analyzed countries in three different time windows (see sub-captions). the nodes are coloured based on the community labels obtained by maximising modularity using the louvain algorithm we filtered edges with weight below 20 co-occurrences for visualisation purposes the authors would like to thank mirta galesic, rodrigo leal cervantes, rita maria del rio chanona, françois lafond, and j. doyne farmer for helpful feedback, and to the oxford inet complexity economics group for stimulating discussions. key: cord-300651-4didq6dk authors: sun, ya-jun; feng, yi-jin; chen, jing; li, bo; luo, zhong-cheng; wang, pei-xi title: clinical features of fatalities in patients with covid-19 date: 2020-07-14 journal: disaster medicine and public health preparedness doi: 10.1017/dmp.2020.235 sha: doc_id: 300651 cord_uid: 4didq6dk objectives: the novel coronavirus disease 2019 (covid-19) pandemic has spread to over 213 countries and territories. we sought to describe the clinical features of fatalities in patients with severe covid-19. 
methods: we conducted an internet-based retrospective cohort study by retrieving the clinical information of 100 covid-19 deaths from nonduplicating incidental reports in chinese provincial and other governmental websites between january 23 and march 10, 2020. results: approximately 6 of 10 covid-19 deaths were males (64.0%). the average age was 70.7 ± 13.5 y, and 84% of patients were elderly (over age 60 y). the mean duration from admission to diagnosis was 2.2 ± 3.8 d (median: 1 d). the mean duration from diagnosis to death was 9.9 ± 7.0 d (median: 9 d). approximately 3 of 4 cases (76.0%) were complicated by 1 or more chronic diseases, including hypertension (41.0%), diabetes (29.0%), coronary heart disease (27.0%), respiratory disorders (23.0%), and cerebrovascular disease (12.0%). fever (46.0%), cough (33.0%), and shortness of breath (9.0%) were the most common first symptoms. multiple organ failure (67.9%), circulatory failure (20.2%), and respiratory failure (11.9%) were the top 3 direct causes of death. conclusions: covid-19 deaths occur mainly among the elderly and patients with chronic diseases, especially cardiovascular disorders and diabetes. multiple organ failure is the most common direct cause of death.
in december 2019, several cases of pneumonia of unknown cause were reported in wuhan, china, and were later recognized as a novel coronavirus infection, named coronavirus disease 2019 (covid-19) by the world health organization (who). 1 covid-19 has been included in the law of the people's republic of china on the prevention and treatment of infectious diseases as a class b infectious disease. all provinces and cities in china have taken first-level public health emergency responses to contain the transmission of the disease and protect vulnerable populations. the epidemic has spread across china as well as into 213 countries and territories.
2 the covid-19's socioeconomic impacts have already far exceeded those of severe acute respiratory syndrome (sars) and the middle east respiratory syndrome (mers), and the pandemic has become a worldwide major public health concern. as of june 21, 2020, the global number of confirmed cases of covid-19 exceeded 9 million, leading to over 470 thousand of fatalities. 2 we attempted to describe the clinical characteristics of fatalities in patients with covid-19, which may inform the clinical management of patients with severe covid-19. this was an internet-based data intelligence study. we constituted a cohort of covid-19 deaths through retrieving the clinical information on covid-19 fatalities from nonduplicating incidental reports in chinese provincial and metropolitan city health commission and other governmental official websites between january 23 and march 10, 2020. the reported clinical characteristics included the patient's age, sex, initial onset symptoms, pre-existing chronic diseases, direct cause of death, date of admission, date of diagnosis, and date of death. the study cohort included 100 cases of covid-19 fatalities. the study was approved by the research ethics committee of henan university. informed consent was waived, because the study was based on publicly available anonymized incidental fatality reports. spss (version 22.0) software was used for statistical analysis. mean ± standard deviations (sd) and median (inter-quartile range) were presented for continuous variables, while frequency and percentage were presented for categorical variables. patients were not involved in the study, which is based on anonymized incidental covid-19 fatality reports from governmental websites. approximately 6 of 10 covid-19 fatalities (64.0%) were males ( table 1 ). the average age was 70.7 ± 13.5 y (median: 72.5 y), and approximately 8 of 10 patients (84.0%) were over 60 y of age. the mean duration from admission to diagnosis was 2.2 ± 2.8 d (median: 1). 
the average duration from the diagnosis to death was 9.9 ± 6.5 d (median: 9 d). approximately 3 of 4 fatal covid-19 cases (76.0%) had 1 or more pre-existing chronic illnesses. the prevalence rates were 41.0% for hypertension, 29% for diabetes, 27.0% for coronary heart disease, 23% for respiratory disorders, 12% for cerebrovascular disease, 3% for cancers, 5% for abnormal renal function, and 2.0% for parkinson's disease. approximately half of patients (48%) had 2 or more chronic diseases. fever (46.0%), cough (33.0%), shortness of breath (9.0%), fatigue and weakness (8.0%), sputum (7.0%), and dyspnea (7.0%) were the common symptoms at onset, whereas palpitations and diarrhea were less frequent. among the 100 covid-19 fatalities, 16 cases were missing data on direct cause of death. of the 84 covid-19 cases with known direct cause of death, the top 3 common direct causes of death were multiple organ failure (67.9%), circulatory failure (20.2%), and respiratory failure (11.9%), and were similar (p = 0.50) for males and females: multiple organ failure (64.8% vs 73.3%, respectively), circulatory failure (24.1% vs 13.4%, respectively), and respiratory failure (11.1% vs 13.3%, respectively). in this internet-based data intelligence study, we observed that the majority of covid-19 deaths were elderly (approximately 8 of 10) and males (6 of 10), and most fatalities (3 of 4) occurred in patients with chronic illnesses. the findings were consistent with a recent report in a hospital-based study, 3 and with the who report on covid-19 in china, 4 and demonstrate the usefulness of an internet-based data intelligence study. previous studies have not clarified the direct causes of death. our data indicate that the most common direct cause of death is multiple organ failure (approximately 2 of 3). the initial onset symptoms are not so much saliently worrisome, but the median duration from diagnosis to death was only 9 d, indicating that the disease can worsen rapidly, costing life. 
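as a quick arithmetic check, the cause-of-death percentages reported above are relative to the 84 cases with a known direct cause, not to all 100 fatalities. the sketch below is illustrative only: the counts for circulatory failure (17) and respiratory failure (10) appear in the table 1 residue of this report, while the count for multiple organ failure (57) is not stated and is derived here as the remainder, which is an assumption.

```python
# Cases with a known direct cause of death: 100 fatalities minus 16 missing.
known_total = 100 - 16

cause_counts = {
    # Assumption: derived as 84 - 17 - 10; this count is not given in the text.
    "multiple organ failure": known_total - 17 - 10,
    "circulatory failure": 17,   # from the table 1 fragment
    "respiratory failure": 10,   # from the table 1 fragment
}

# Percentages use the 84 known-cause cases as the denominator.
percentages = {
    cause: round(100 * count / known_total, 1)
    for cause, count in cause_counts.items()
}

for cause, pct in percentages.items():
    print(f"{cause}: {cause_counts[cause]} of {known_total} ({pct}%)")
```

under these assumptions the recomputed shares agree with the reported 67.9%, 20.2%, and 11.9%.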
innate immunity and neutrophil function may decline with aging, exposing the elderly to the more deleterious impact of the new coronavirus infection. similar to sars and mers, covid-19 presents a clear male sex bias. 5 compared with males, the immune response in females may be more vigorous, with higher antibody levels following exposure to an infectious agent; 6 thus, females may be less vulnerable to the deleterious consequences of covid-19 infection. it has been speculated that women's lower susceptibility to viral infections may be related to genetic factors associated with the x chromosome and sex hormones. another possible explanation for the higher incidence and the greater number of covid-19 fatalities among males is that males are likely to spend more time outdoors, increasing the chances of exposure to the virus. most fatalities occurred in patients with chronic illnesses. 3, 7 the top 4 were hypertension (41.0%), diabetes (29.0%), coronary heart disease (27.0%), and respiratory disorders (23.0%). previous studies indicate that the covid-19 virus shares the same receptor as sars-cov, and the angiotensin-converting enzyme-2 (ace2) sensitive cell surface receptors mediate the entry of the virus into the target cells. 8 ace2, the functional receptor of sars-cov, is expressed in the pancreatic islets, through which the virus may invade and destroy the islet cells, and thus may aggravate diabetes and accelerate the disease progression. the immune system plays a crucial role when the body is confronted with viruses or bacteria. for patients with diabetes, especially those with poor blood glucose control, long-term exposure to hyperglycemia may lead to decreased immune function.
table 1 (partial). direct cause of death, n (%): circulatory failure 17 (20.2); respiratory failure 10 (11.9). a a total of 12 cases were missing data on the time from diagnosis to death. b a total of 16 cases were missing data on direct cause of death.
other chronic illnesses may also compromise the patient's immune defense system leading to severe consequences. multiple organ failure, respiratory failure, and circulatory failure were the main direct causes of deaths. similar to mers-cov, 9 multiple organ failure appears to be a common direct cause of death in covid-19 fatalities. the covid-19 infection may lead to increased blood capillary permeability of the lungs, 10 aggravating inflammation and apoptosis, with lung injuries leading to respiratory distress syndrome. the virus may set off an immune inflammatory response storm, causing tissue damages in multiple organs leading to multiple organ failure. this study has some limitations. first, we did not have the laboratory data in this internet reports-based study. the reported clinical characteristics are relatively limited in internet reports. it is unclear whether there is a selection bias in internet reports of covid-19 fatalities compared with those fatalities in the general population. however, our data on age and sex distributions of covid-19 deaths are consistent with the recent report on 113 deaths in a single large hospital-based study in wuhan, china. 3 in conclusion, covid-19 deaths are mainly elderly and patients with chronic diseases, especially cardiovascular disorders and diabetes. multiple organ failure is the most common direct cause of death. our findings may inform clinical healthcare professionals in better management of severe covid-19 patients in fighting the emerging pandemic. clinical features of patients infected with 2019 novel coronavirus in wuhan covid-19 coronavirus pandemic clinical characteristics of 113 deceased patients with coronavirus disease 2019: retrospective study who. 
report of the who-china joint mission on coronavirus disease sex-based differences in susceptibility to severe acute respiratory syndrome coronavirus infection sexual dimorphism in innate immunity case-fatality rate and characteristics of patients dying in relation to covid-19 in italy genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding middle east respiratory syndrome pulmonary vascular endothelialitis, thrombosis, and angiogenesis in covid-19 we acknowledge zhong-xiang wang (school of nursing and health, zhengzhou university) for his helpful suggestions and spiritual support. all authors contributed to the development of the conceptual framework of this study. p.x.w. and z.c.l. initiated the study, supervised the collection of research data. y.j.s., y.j.f., and c.j. collected the data. b.l. contributed to data interpretation. y.j.s. analyzed the data and drafted the manuscript. all authors contributed to critical revisions of the study, and approved the final version for publication. this work was supported by research grants from the foshan covid-19 emergency technology project (2020001000376) and the canadian institutes of health research (cihr grant # 155955). the authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. the study was approved by the research ethics committee of henan university (husom 2020-0 11), and informed consent was waivered because the study was based on publicly available anonymized incidental fatality reports. the study data are available from the corresponding author upon reasonable request. key: cord-350261-7lkcdisr authors: asirvatham, edwin sam; sarman, charishma jones; saravanamurthy, sakthivel p.; mahalingam, periasamy; maduraipandian, swarna; lakshmanan, jeyaseelan title: who is dying from covid-19 and when? 
an analysis of fatalities in tamil nadu, india date: 2020-10-03 journal: clin epidemiol glob health doi: 10.1016/j.cegh.2020.09.010 sha: doc_id: 350261 cord_uid: 7lkcdisr background: as the number of covid-19 cases continues to rise, public health efforts must focus on preventing avoidable fatalities. understanding the demographic and clinical characteristics of deceased covid-19 patients; and estimation of time-interval between symptom onset, hospital admission and death could inform public health interventions focusing on preventing mortality due to covid-19. methods: we obtained covid-19 death summaries from the official dashboard of the government of tamil nadu, between 10th may and july 10, 2020. of the 1783 deaths, we included 1761 cases for analysis. results: the mean age of the deceased was 62.5 years (sd: 13.7). the crude death rate was 2.44 per 100,000 population; the age-specific death rate was 22.72 among above 75 years and 0.02 among less than 14 years, and it was higher among men (3.5 vs 1.4 per 100,000 population). around 85% reported having any one or more comorbidities; diabetes (62%), hypertension (49.2%) and cad (17.5%) were the commonly reported comorbidities. the median time interval between symptom onset and hospital admission was 4 days (iqr: 2, 7); admission and death was 4 days (iqr: 2, 7) with a significant difference between the type of admitting hospital. one-fourth of (24.2%) deaths occurred within a day of hospital admission. conclusion: elderly, male, people living in densely populated areas and people with underlying comorbidities die disproportionately due to covid-19. while shorter time-interval between symptom onset and admission is essential, the relatively short time interval between admission and death is a concern and the possible reasons must be evaluated and addressed to reduce avoidable mortality. 
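the crude death rate quoted in the abstract can be approximately reproduced from the figures in this report. the sketch below is a worked example under an explicit assumption: it uses the rounded state population of about 72 million given in the introduction, whereas the exact denominator used by the authors is not stated. the function name is hypothetical.

```python
def crude_death_rate(deaths: int, population: float, per: int = 100_000) -> float:
    """Deaths per `per` population over the study period."""
    return deaths / population * per


# 1,761 analysed deaths; population assumed to be the rounded 72 million.
rate = crude_death_rate(deaths=1761, population=72_000_000)
print(f"{rate:.2f} deaths per 100,000 population")
```

with the rounded population the result is about 2.45, close to the reported 2.44; the small gap is consistent with a slightly larger exact population figure.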
as of 10th july 2020, sars-cov-2 had infected around 820,000 individuals, with 22,000 deaths, in india (1). tamil nadu, a south indian state with a population of around 72 million (2), reported around 130,000 cases and 1,829 deaths, which was 16% of total confirmed cases and 8.3% of total deaths in india (3). the rapid spread of the disease has undoubtedly become a burden to health systems in several countries, as a significant proportion of elderly, immunosuppressed and those with underlying metabolic, cardiovascular or respiratory diseases continue to develop severe forms of covid-19 and are at an increased risk for adverse outcomes (4). at the same time, evidence is emerging to caution that the young and adult general population are also at considerable risk for critical illness and adverse outcomes (5). as the number of cases continues to increase, public health efforts must focus on preventing avoidable fatalities. when a health system is burdened beyond its capacity and the morale of health care workers is affected, the standard of care would be compromised, leading to negative health outcomes. current therapeutic strategies to deal with the covid-19 infection are only supportive, and prevention efforts aimed at reducing transmission in the community are considered the most effective method (6). however, the fatality due to covid-19 could be reduced if there is early and accurate diagnosis, identification of clinical features of severe risks, prediction of disease progression and appropriate clinical intervention. further, early seeking of medical care by people with exposure and symptoms, especially the most vulnerable, could substantially reduce the spread of infection, severity and fatality due to this disease and produce better clinical outcomes (7). currently, the covid-19 infection has its presence across the globe, generating new information, fresh knowledge and evidence continuously.
however, there are still many unknowns and ambiguity about the demographic and clinical characteristics of deceased covid-19 patients, and about the different time intervals between the time of infection and outcome (death or recovery) (8). the currently available literature indicates that information varies by context, across regions and countries, emphasizing the need for generating evidence for a specific geography, population, and context. this study aims to understand the demographic and clinical characteristics of deceased covid-19 patients; and estimate the time-interval between symptom onset, hospital admission and death, which could inform public health interventions focusing on preventing mortality due to covid-19. we obtained covid-19 death summaries from the official dashboard of the government of tamil nadu (https://stopcorona.tn.gov.in/), a south indian state. each death summary consisted of information such as district, age, gender, type of admitting hospital, presence of comorbidities, presenting symptoms, number of days with symptoms, date of hospital admission and date of death. we collected information from 10th may 2020 to 10th july 2020. a total of 1,783 deaths were reported during the period. we excluded cases brought in dead (n = 22) and finalised 1,761 deaths for analysis. for the analysis of comorbidities, we included only the cases that reported the presence or absence with details of comorbidities. for the analysis of presenting symptoms, we included only the death summaries that indicated the presence of symptoms with details, as the absence of symptoms
we summarised the categorical variables as frequency and percentages; and continuous variables as mean, standard deviation (sd), median, and interquartile range (iqr) as appropriate. we analysed the continuous variables using the independent t-tests or mann-whitney test/ kruskal-wallis h test. the proportions for categorical variables were analysed using chi-squared test (χ2) test. the analyses were performed using spss version 24.0 (ibm corp., armonk, ny, usa). among cases (1,678) that reported the presence or absence with details of comorbidities, 85.3% reported any one or more comorbidities at the time of hospital admission (table-2 ). diabetes was found to be the most common comorbidity associated with 62% of the deceased; hypertension and cad were present among 49.2% and 17.5% of the deceased respectively. the coexistence of diabetes and hypertension; diabetes, hypertension and cad were found among 36.6% and 8.7% of the individuals respectively. as expected, the study found a significantly higher presence of comorbidities among the elderly compared to the younger age groups (p<0.001), in terms of the presence of any one or more comorbidities, diabetes, hypertension, chronic obstructive pulmonary disease (copd), coronary artery disease (cad), the coexistence of diabetes and hypertension, and the coexistence of diabetes, hypertension and cad. as age increased, the presence of comorbidities seems to increase significantly. the presence of any one or more comorbidities (p=0.002) and comorbidities categorised as others (p<0.001) were found to be higher among women. 
private hospitals reported to have treated a significantly higher proportion of deceased patients with comorbidities such as diabetes (p=0.01), hypertension (p<0.001), cad (p<0.06), asthma (p<0.06), the coexistence of diabetes and hypertension (p<0.001), and the coexistence of diabetes, hypertension and cad (p=0.007), whereas public hospitals reported to have treated a higher proportion of patients with ckd (p<0.001). around 18% of the deceased reported having other comorbidities such as hypothyroidism, dementia, encephalopathy, cerebrovascular diseases (cvs), hepatitis etc.; however, the presence of these as an independent comorbidity was found to be very low (3.6%, n = 61). around 583 death summaries contained details of presenting symptoms on admission (table-3). fever was the most common presenting symptom (78.7%), reported by 80.5% of the deceased men and 73.7% of women. breathing difficulty was reported by 75.8% of the patients (76.1% of men, 75% of women); around 53% had cough (54.8% of men and 49.3% of women) and 35.8% had fever, cough and breathing difficulty together (37.4% of men and 31.6% of women); however, the differences were not statistically significant. diarrhoea and generalised weakness/myalgia were reported by 6.2% and 7.5% of the patients respectively, with women reporting them significantly more often than men (p=0.01; p=0.02). fever was reported significantly more often among the older population (p=0.001). the median time interval between onset of symptoms and hospital admission was 4 days (iqr: 2, 7) without significant difference among gender and age groups (table-4). the patients who were admitted in private hospitals had a median of 4 days of symptoms as compared to public hospital patients with 3 days of symptoms (p<0.005). the median time interval between hospital admission and death was 4 days (iqr: 2, 7) and there were no significant differences among gender and age groups.
however, it was significantly higher, 6 days (iqr: 2, 10), in private hospitals compared to 3 days (iqr: 1, 6) in public hospitals (p<0.001). around one fourth (24.2%) of the reported deaths occurred within a day of hospital admission (public 27.8%; private 15.3%), 23.7% between 2 to 3 days (public 26.6%; private 16.7%), 8.1% between 4 to 5 days (public 8.5%; private 7.2%) and 43.4% occurred after 5 days of admission (public 36.5%; private 60.2%) (p<0.001). it is well documented that the covid-19 pandemic takes different shapes and forms with varying mortality levels across geographic regions and countries. though there are several studies from other countries that explained the characteristics of covid-19 deaths, there is a dearth of peer-reviewed and published literature from india. our study analysed the individual death summaries, and described the demographic and clinical characteristics of deceased covid-19 patients; and estimated the time intervals from symptom onset to hospital admission and from admission to death, which are critical for developing context- and geography-specific public health interventions focusing on reducing mortality. our study findings indicated a disproportionate death rate among the categories of age, gender and geography. it is well demonstrated that age is the most significant risk factor for death due to covid-19 and our study confirms the existing evidence. the increasing death rate with age is expected and it could be due to the higher prevalence of comorbidities, and the reduced and less responsive innate and adaptive immune system among the elderly (9, 10). the study reported a higher proportion of deaths (71%) among men, though the proportion of
the less mortality among women has been reported in many studies which could be due to the protection of x chromosome and sex hormones, which play an important role in providing innate and adaptive immunity (12) . the higher mortality among men could be due to the behavioral risk factors such as smoking, and alcohol consumption, which are relatively higher among men in india (13) . around 67% of deaths occurred in the capital district/city that recorded 58% of the total cases and just 6.4% of the population of the state. it is also the smallest and densest of all the districts in the state (2). the possible reasons could be, lack of timely access to healthcare facilities, delayed seeking of care, overwhelming of health system due to sudden surge of cases due to larger and exponential spread of the virus in the city. the disease severity, increased admission rate into the intensive care unit (icu), and increased risk of mortality of covid-19 are strongly associated with comorbidities such as diabetes, hypertension, obesity, cardiovascular disease, and respiratory system diseases and our study results confirm the previous findings (4, 14, 15) . a study showed a hazard ratio of 2.59 among patients with two or more comorbidities compared to 1.79 among patients with one comorbidity(4). the centers for disease control and prevention reported 12 times higher deaths among patients with reported underlying conditions compared with those without reported underlying conditions (19.5% versus 1.6%) (16) . in our study, the prevalence of any one or more comorbidities among the deceased was found to be around 85%, and a significant proportion of deceased having other comorbidities such as diabetes, hypertension, and cad respectively, with a strong association with age. 
studies in china reported around 70% of deaths with any one comorbidity (4, 17), south korea and brazil reported 83% and 90.7%; and these studies have reported hypertension, cad, and diabetes as the main comorbidities among deaths (18, 19). in our study, fever and breathing difficulty were each reported among three fourths of the deceased, cough among half of the patients, and around one third had fever, breathing difficulty and cough together, which is in line with the existing literature that fever, dry cough, shortness of breath and fatigue were the common symptoms on admission among the deceased patients (14, 17). a meta-analysis of covid-19 patients showed fever (88.8%) as the most common symptom, followed by dry cough (68%) and fatigue (20). another study that reviewed 24,410 cases across the world showed 78% of the cases with fever and 58% with cough (21). though the study found an association between fever as a presenting symptom and age of the deceased, other symptoms and multiple concurrent symptoms did not indicate any association with age, gender and admitting hospital. the median time interval between symptom onset and hospital admission was found to be 4 days, which is within the range mentioned in studies from china (4, 10 days), singapore (4 days), and italy (7 days) that indicated a range of 3 to 10 days (14, 22-24). the duration might change during the different phases of the epidemic due to rapid changes in the level of knowledge and awareness, stigma and discrimination, fear of the disease, programmatic interventions, health-seeking behaviors, and access to health care services. as the epidemic progresses and there is continuous scale-up of public health and social interventions, the duration is expected to decrease, which could potentially limit the spread of infection to others.
according to our study, the median time interval between hospital admission and death was 4 days, with a significant difference between patients admitted to private and public hospitals. other countries reported a slightly higher, but wide, range of 5 to 16 days between hospital admission and death (14, 25-27). a study that reviewed the length of stay across the world reported a shorter length of stay for those who died in hospital compared to those who were discharged alive, with medians between 4 and 21 days compared to 4 and 53 days, respectively (28). the shorter period of hospital stay with a negative outcome could be due to delayed care seeking for economic reasons and lack of awareness about the disease, shortage of icu facilities and ventilators, and overwhelmed hospitals with reduced capacity to deliver services effectively. the knowledge and awareness level of health workers, especially about the rapid progression to severe illness, treatment protocols, and the availability of effective drugs, could also alter the duration of hospital stay. the relatively longer time interval between hospital admission and death in private hospitals, despite the higher proportion of deceased patients with comorbidities and multiple concurrent symptoms, could be due to the availability of advanced technology and facilities for advanced life-supporting critical care, the ability of those seeking care to afford it, and a lower patient burden, especially for covid-19 management, as compared to public hospitals. studies reported an average incubation period of 4 to 7 days, which is the time interval between exposure/infection and onset of symptoms (4, 22, 29).
considering an average of 5 days between exposure/infection and onset of symptoms, the estimated time interval between exposure/infection and death is 13 days in the state, whereas the time interval between symptom onset and death is just 8 days. two studies have reported higher average time intervals of 17.8 and 18.5 days between the first recorded symptoms and death (10, 30) than found in our study. the study provides evidence from india, emphasizing that the elderly, males, people living in densely populated areas, and patients with underlying comorbidities die disproportionately due to covid-19. the study estimated a time interval of 13 days between exposure to infection and death, with 4 days each for symptom onset to hospital admission and for admission to death. none of the potential factors significantly altered the time intervals from onset of symptoms to hospital admission and from admission to death, except the type of treating hospital. a shorter time interval between the onset of symptoms and hospital admission is critical, as early diagnosis, supportive care, and treatment could substantially reduce mortality, especially among the elderly and vulnerable populations. however, the short time interval between admission and death is a concern, and the possible reasons must be elucidated and addressed. essentially, as the number of deaths from covid-19 continues to increase, early diagnosis and timely treatment for moderate and severe cases are of crucial importance to reduce mortality. the analysis is based on data available in the public domain and is limited to deaths that were reported in the state through hospital admission. hospital care can vary from general ward care to intensive care, and we do not have disaggregated data for this.
government of india
covid-19 active cases in tamil nadu chennai: health & family welfare department, government of tamil nadu
comorbidity and its impact on 1590 patients with covid-19 in china: a nationwide analysis
medical vulnerability of young adults to severe covid-19 illness: data from the national health interview survey
evaluation and treatment coronavirus (covid-19): in: statpearls
duration of symptom onset to hospital admission and admission to discharge or death in sars in mainland china: a descriptive study
rapid progression to acute respiratory distress syndrome: review of current understanding of critical illness from covid-19 infection
why does covid-19 disproportionately affect older people? aging (albany ny)
estimates of the severity of coronavirus disease 2019: a model-based analysis. the lancet infectious diseases
daily report on public health measures taken for covid-19 chennai: directorate of public health and preventive medicine, health and family welfare department, government of tamil nadu
molecular mechanisms of sex bias differences in covid-19 mortality
national family health survey-2017 (nfhs-4) new delhi: ministry of health and family welfare, government of india
clinical characteristics of 113 deceased patients with coronavirus disease 2019: retrospective study
comorbid chronic diseases are strongly correlated with disease severity among covid-19 patients: a systematic review and meta-analysis
covid-19) -united states
clinical features of 85 fatal cases of covid-19 from wuhan. a retrospective observational study
chronic heart diseases as the most prevalent comorbidities among deaths by covid-19 in brazil
korean society of infectious d, korea centers for disease c, prevention. analysis on 54 mortality cases of coronavirus disease 2019 in the republic of korea from
comorbidity and its impact on patients with covid-19
the prevalence of symptoms in 24 covid-19): a systematic review and meta-analysis of 148 studies from 9 countries
clinical progression of patients with covid-19 in shanghai
investigation of three clusters of covid-19 in singapore: implications for surveillance and response measures
30-day mortality in patients hospitalized with covid-19 during the first wave of the italian epidemic: a prospective cohort study
clinical course and outcomes of critically ill patients with sars-cov-2 pneumonia in wuhan, china: a single-centered, retrospective, observational study
clinical characteristics and outcomes of patients undergoing surgeries during the incubation period of covid-19 infection
predictors of mortality for patients with covid-19 pneumonia caused by sars-cov-2: a prospective cohort study
covid-19 length of hospital stay: a systematic review and data synthesis
the incubation period of coronavirus disease 2019 (covid-19) from publicly reported confirmed cases: estimation and application
clinical course and risk factors for mortality of adult inpatients with covid-19 in wuhan, china: a retrospective cohort study. the lancet
the authors declare that they have no conflict of interest. this research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. not required
key: cord-249569-78zstcag authors: kim, t.; lieberman, b.; luta, g.; pena, e.
title: prediction regions for poisson and over-dispersed poisson regression models with applications to forecasting number of deaths during the covid-19 pandemic date: 2020-07-04 journal: nan doi: nan sha: doc_id: 249569 cord_uid: 78zstcag motivated by the current coronavirus disease (covid-19) pandemic, which is due to the sars-cov-2 virus, and the important problem of forecasting daily deaths and cumulative deaths, this paper examines the construction of prediction regions or intervals under the poisson regression model and under an over-dispersed poisson regression model. for the poisson regression model, several prediction regions are developed and their performance is compared through simulation studies. the methods are applied to the problem of forecasting daily and cumulative deaths in the united states (us) due to covid-19. to examine their performance relative to what actually happened, daily deaths data until may 15th were used to forecast cumulative deaths by june 1st. it was observed that there is over-dispersion in the observed data relative to the poisson regression model. an over-dispersed poisson regression model is therefore proposed. this new model builds on frailty ideas in survival analysis, and over-dispersion is quantified through an additional parameter. the poisson regression model is a hidden model in this over-dispersed poisson regression model and is obtained as a limiting case when the over-dispersion parameter increases to infinity. a prediction region for the cumulative number of us deaths due to covid-19 by july 16th, given the data until july 2nd, is presented. finally, the paper discusses limitations of the proposed procedures and mentions open research problems, as well as the dangers and pitfalls of forecasting on a long horizon, with focus on this pandemic, where events, both foreseen and unforeseen, could have huge impacts on point predictions and prediction regions.
the current coronavirus disease (covid-19) pandemic [12], caused by the sars-cov-2 virus, is providing statisticians, data scientists, machine learners, and other modelers a real-time laboratory to test and demonstrate their forecasting skills and abilities, with the quality of their forecasts assessable in a matter of days, weeks, or months. see, for instance, https://covid19-projections.com from the massachusetts institute of technology (mit) and the institute for health metrics and evaluation (ihme)'s https://covid19.healthdata.org/united-states-of-america based at the university of washington in seattle, as well as [15] discussing the complexities of modeling pandemics. of particular interest is the forecasting of the numbers of daily cases, deaths, and hospitalizations, or of the cumulative cases, deaths, and hospitalizations attributable to covid-19 at a future date in a specified country or locality (e.g., a county, state, or province) on the basis of currently observed cases, deaths, and hospitalizations data. such forecasts are of critical importance since they are major components in the decision-making process of government officials, business leaders, and educational and university administrators regarding the termination of lockdowns, lessening of social distancing and other mitigation regulations, opening of businesses, or continuing with online class formats in k-12 schools, colleges, and universities. the left panel of figure 1 provides the daily number of reported deaths due to covid-19 for the united states (us) with respect to the number of days since december 31, 2019 until may 15, 2020, which is day 137 in the figures, as reported by the european centre for disease prevention and control (ecdc) [11] [see section a.1]. for a given date/day in the data set, including weekends, the numbers reported are from the preceding day, which is due to a processing lag in reporting.
the right panel of figure 1 is a scatterplot of the cumulative number of deaths in the us due to covid-19. given this data set of daily and cumulative deaths, it is of interest to forecast the number of cumulative deaths in the us by, say, may 25, 2020 (corresponding to day 147), which is memorial day, and to ask whether by that day the cumulative number of deaths in the us due to covid-19 will have surpassed the ominously depressing and grim milestone of 100,000 cumulative deaths. later, for our illustration, we will consider the problem of forecasting the cumulative number of deaths in the us due to covid-19 at the end of may 2020, and compare our forecast with what eventually occurred. finally, we attempt to forecast the cumulative number of deaths by july 16th based on the data up to july 2nd. such forecasting problems are clearly non-trivial since there is the distinct possibility that whatever model we had fitted in the observed time-frame may not apply to the time period under forecast, the ever-present danger and risk of extrapolation. aside from the fitted model most likely not being the true data-generating model (recalling the aphorism attributed to george e. p. box [7] that all models are wrong, but some are useful), there are other factors, some beyond our control, that could impact the number of reported deaths at a future time, such as premature easing of social distancing and re-opening of business establishments, virus mutations, better diagnostic tools, changing hotspots, overburdened health care facilities, introduction of effective treatments, beneficial or detrimental actions by local, state, and/or federal entities, changing definitions of deaths due to covid-19, under- or over-reporting of deaths, timely development of a vaccine, protests and riots arising from social unrest, and others.
but high-level decision-makers such as government officials, business leaders, educational administrators, and society itself, demand some beacon, however dim such beacons may be, to guide them in their decision-making. statisticians, data scientists, machine learners, and other modelers are always ready and willing to provide such beacons. this paper is in this spirit. we will examine existing methods and develop new methods for constructing prediction regions for random variables that pertain to the number of occurrences of an event of interest. a prediction region contains more information than a point prediction alone since it conveys the uncertainty inherent in the prediction. note that with a prediction region we are interested in the would-be realized value of a random variable, not the value of a parameter; hence it is called a prediction region rather than a confidence region. the events of particular interest are those that are 'rare' in the sense that, informally, the probability of an event occurring in an infinitesimal interval is also infinitesimal. consequently, our starting point will be the poisson distribution, which is a model for the number of occurrences of a rare event; we then transition to the more general poisson regression model, and eventually to an over-dispersed poisson regression model, which turns out to be a better model in the covid-19 application. the real-life and practical application to which our methods will be applied is the construction of prediction regions for the daily and cumulative number of deaths due to covid-19 in the us for a future date given only the daily deaths data until a current date.
note that such predictions or forecasts could probably be improved by utilizing other information (such as the capacities of health care facilities; movements of people in a region; information about sensitivity and specificity of diagnostic tests; transmission rates ($R_0$) of the virus; and others), or via stratification by states, cities, or counties and then combining the results from these strata to obtain a point prediction and a prediction region for the whole us. however, we approach the construction of the prediction regions for the daily and cumulative deaths at a future date by just utilizing the observed reported daily deaths data for the whole us, which in a sense is the most reliable available data regarding this covid-19 pandemic. it might be possible to utilize information about the number of cases or infected individuals, which is also reported daily, but we feel that this information is not reliable since it is highly dependent on the number of tests that are performed and on the sensitivity and specificity of the diagnostic tests used. in addition, if such information is to be used in the prediction model, then we may not have its realized values at the future date for which the prediction region is desired. we point out that even though we are employing probabilistic models in the form of the poisson or an over-dispersed poisson model, which are derivable from intuitive conditions when dealing with rare events (cf. [23, 16]), our prediction method is still purely data-driven, being reliant only on the observed data. the occurrence of a death due to covid-19 could still be considered a rare event when viewed in the context of the whole population, though even if rare, deaths are still significant and dire events. this is because to die of covid-19, generally one first needs to get infected, which at this point is still a rare event, and then, having been infected, to die from it.
the rate of dying when infected with covid-19, if not age-adjusted, is still rather low, less than 2% (see, for instance, coronavirus (covid-19) mortality rate). because of this rarity, a plausible probability model for the number of deaths due to covid-19 is the poisson model, whose probability mass function (pmf) is given by $p(y|\lambda) = \frac{e^{-\lambda}\lambda^{y}}{y!}\, I\{y \in \mathbb{Z}_{0,+}\}$, with $\mathbb{Z}_{0,+} = \{0, 1, 2, \ldots\}$, where $\lambda > 0$ is the rate parameter, which is also the mean and variance of the distribution, and $I\{\cdot\}$ is the indicator function. for a variable $Y$ with this poisson distribution, we write $Y \sim \mathrm{Poi}(\lambda)$. the cumulative distribution function of a $\mathrm{Poi}(\lambda)$ is $P(Y \le y) = \sum_{k=0}^{\lfloor y \rfloor} p(k|\lambda)$ for $y \ge 0$. we start our investigations with this no-covariate poisson model, equivalently, a model with intercept only, since results for the poisson regression model build on this no-covariate model. suppose now that $Y_0 \sim \mathrm{Poi}(\lambda)$, where for the moment we assume that we know the rate parameter $\lambda$. we seek a $100(1-\alpha)\%$ prediction region $\Gamma(\lambda, \alpha) \subset \mathbb{Z}_{0,+}$ for $Y_0$, that is, a region satisfying $P\{Y_0 \in \Gamma(\lambda, \alpha)\} \ge 1 - \alpha$. note that this region will not be an interval, being a subset of $\mathbb{Z}_{0,+}$, though if it is formed as the intersection between $\mathbb{Z}_{0,+}$ and an interval in $\mathbb{R}$, then we may, imprecisely, call it an interval. subject to this coverage condition, a desirable property of such a region is that its cardinality is as small as possible. if we allow $\Gamma(\lambda, \alpha)$ to depend on a randomizer $U$, a standard uniform random variable independent of $Y_0$, the smallest-cardinality $100(1-\alpha)\%$ prediction region, obtained using a neyman-pearson lemma type argument, is the region $\Gamma_0(U; \lambda, \alpha)$ in (4). it is built from subsets of $\mathbb{Z}_{0,+}$ indexed by a threshold $d \in \mathbb{R}$ on the probabilities $p(k|\lambda)$, with a cutoff $c(\alpha)$ and a randomization constant $\gamma(\alpha)$ determined from the coverage requirement, using the convention $0/0 = 0$. observe that by allowing randomized prediction regions, we have $P\{Y_0 \in \Gamma_0(U; \lambda, \alpha)\} = 1 - \alpha$. if we do not admit randomized prediction regions, which is achieved by always taking $U = 0$ in $\Gamma_0(U; \lambda, \alpha)$, then unless $1 - \alpha$ is a 'natural' prediction coefficient, we will not achieve equality in the preceding probability statement. the use of the adjective 'natural' is analogous to its use in constructing nonparametric confidence intervals, cf. [21].
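as a concrete illustration of the known-λ case, the following python sketch evaluates the poisson pmf and cdf and builds a nonrandomized region by adding support points in order of decreasing probability until the accumulated mass reaches 1 − α. the function names, the truncation of the support, and the greedy construction are illustrative assumptions introduced here, not the authors' code (the paper reports using r); the sketch conveys the idea behind a nonrandomized smallest-cardinality region and does not claim to reproduce the region in (4) exactly.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(Y = k) for Y ~ Poi(lam), evaluated on the log scale for stability."""
    if k < 0:
        return 0.0
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def poisson_cdf(k: int, lam: float) -> float:
    """P(Y <= k) for Y ~ Poi(lam)."""
    return sum(poisson_pmf(j, lam) for j in range(k + 1))


def highest_mass_region(lam: float, alpha: float = 0.05):
    """Greedy nonrandomized region: take support points in order of
    decreasing probability until the total mass is at least 1 - alpha."""
    # Generous truncation of the support (an assumption of this sketch).
    upper = int(lam + 10.0 * math.sqrt(lam) + 20)
    ranked = sorted(range(upper + 1),
                    key=lambda k: poisson_pmf(k, lam), reverse=True)
    region, mass = [], 0.0
    for k in ranked:
        region.append(k)
        mass += poisson_pmf(k, lam)
        if mass >= 1.0 - alpha:
            break
    return sorted(region), mass


if __name__ == "__main__":
    region, mass = highest_mass_region(5.0)
    print("limits:", region[0], region[-1], "coverage:", round(mass, 4))
```

because the poisson pmf is unimodal, the selected points form a run of consecutive integers, so reporting the smallest and largest elements describes the whole region. its coverage is at least 1 − α, which matches the conservative behavior described above for nonrandomized regions.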
see [19] on the application of neyman-pearson-type arguments to construct optimal confidence regions, which could be adapted to the construction of prediction regions. there are two other ways of constructing prediction intervals for $Y_0$ when $\lambda$ is large, using normal approximations. to obtain the prediction regions, these intervals are then intersected with $\mathbb{Z}_{0,+}$. letting $N(\mu, \sigma^2)$ denote a normal distribution with mean $\mu$ and variance $\sigma^2$, we recall that when $\lambda$ is large, owing to the central limit theorem and the delta method (cf. [8]), we have the normal approximations $Y_0 \approx N(\lambda, \lambda)$ and $\sqrt{Y_0} \approx N(\sqrt{\lambda}, 1/4)$. let $\phi(\cdot)$ and $\Phi(\cdot)$ be the probability density and cumulative distribution functions of a standard normal random variable, and let $z_\alpha = \Phi^{-1}(1 - \alpha)$ be its $(1 - \alpha)$th quantile. two approximate prediction regions for $Y_0$ when $\lambda$ is large, which are based on the above normal approximations, are given by $\Gamma_1(\lambda, \alpha) = [\lambda - z_{\alpha/2}\sqrt{\lambda},\ \lambda + z_{\alpha/2}\sqrt{\lambda}]$ (5) and $\Gamma_2(\lambda, \alpha) = [(\sqrt{\lambda} - z_{\alpha/2}/2)^2,\ (\sqrt{\lambda} + z_{\alpha/2}/2)^2]$ (6). when $\lambda$ is large, as noted in the construction of $\Gamma_1$, we may approximate the poisson probabilities by normal probabilities, via $p(k|\lambda) \approx \lambda^{-1/2}\phi((k - \lambda)/\sqrt{\lambda})$. as such, the points with the largest poisson probabilities are, approximately, those closest to $\lambda$; consequently, when $\lambda$ is large, the regions $\Gamma_0(U; \lambda, \alpha)$ and $\Gamma_1(\lambda, \alpha)$ should be close to each other. for these prediction regions $\Gamma_{0r}$ (randomized), $\Gamma_{0n}$ (nonrandomized), $\Gamma_1$, and $\Gamma_2$, the exact coverage probabilities and their exact lengths (mean length for $\Gamma_{0r}$) could be computed under the $\mathrm{Poi}(\lambda)$ distribution, since $\lambda$ is known. note that the lengths, which are the differences between the upper and lower integer limits of the prediction regions, are equivalent surrogates of the cardinalities of the regions. figure 2 depicts the exact coverage probabilities (cp), expressed in percentages, and their lengths (expected length for $\Gamma_{0r}$), for different values of $\lambda$. except when $\lambda$ takes small values, where the coverage probabilities of $\Gamma_1$ and $\Gamma_2$ are degraded, especially for the latter, the performance of these prediction regions is quite similar.
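the exact coverage computation described above can be sketched in python as follows. the interval forms used for Γ1 (direct normal approximation) and Γ2 (square-root transformation) are the standard ones implied by the two normal approximations and are assumptions of this sketch, since the displayed equations (5) and (6) are not legible in the source; the function names are likewise hypothetical.

```python
import math
from statistics import NormalDist


def poisson_pmf(k: int, lam: float) -> float:
    """P(Y = k) for Y ~ Poi(lam)."""
    if k < 0:
        return 0.0
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def gamma1(lam: float, alpha: float = 0.05):
    """Integer limits from Y0 ~ N(lam, lam) (assumed form of Gamma_1)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half = z * math.sqrt(lam)
    return max(0, math.ceil(lam - half)), math.floor(lam + half)


def gamma2(lam: float, alpha: float = 0.05):
    """Integer limits from sqrt(Y0) ~ N(sqrt(lam), 1/4) (assumed Gamma_2)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    root = math.sqrt(lam)
    lower = max(0.0, root - z / 2.0) ** 2
    upper = (root + z / 2.0) ** 2
    return math.ceil(lower), math.floor(upper)


def exact_coverage(limits, lam: float) -> float:
    """Exact P(lo <= Y0 <= hi) under Poi(lam)."""
    lo, hi = limits
    return sum(poisson_pmf(k, lam) for k in range(lo, hi + 1))


if __name__ == "__main__":
    for lam in (1.0, 5.0, 30.0, 100.0):
        print(lam,
              gamma1(lam), round(exact_coverage(gamma1(lam), lam), 4),
              gamma2(lam), round(exact_coverage(gamma2(lam), lam), 4))
```

for small λ the square-root interval excludes 0 even though 0 is then a likely outcome, which is one way to see why its coverage degrades in that regime.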
the coverage probability of $\Gamma_{0r}$ is exactly equal to $1 - \alpha$, whereas that of $\Gamma_{0n}$ is always at least equal to $1 - \alpha$. both $\Gamma_1$ and $\Gamma_2$ could have coverage probabilities below the nominal coverage level, though as $\lambda$ increases, these differences become negligible. by construction, $\Gamma_{0r}$ has a shorter interval than $\Gamma_{0n}$; for some values of $\lambda$, the length of $\Gamma_{0r}$ exceeds that of $\Gamma_1$ and $\Gamma_2$, but this is because the coverage probabilities of $\Gamma_1$ and $\Gamma_2$ are lower than the nominal coverage level. but, in the preceding developments, we have assumed that the rate parameter $\lambda$ is known, an unrealistic assumption. how do we deal with the situation when $\lambda$ is unknown? suppose that we had observed a realization $y = (y_1, y_2, \ldots, y_n)$ of a random sample $Y = (Y_1, Y_2, \ldots, Y_n)$ from $\mathrm{Poi}(\lambda)$, so the components of $Y$ are independent and identically distributed (iid) from $\mathrm{Poi}(\lambda)$. our goal is to utilize $Y$ to construct a $100(1 - \alpha)\%$ prediction region for an unobserved $Y_0$, which is independent of $Y$ and whose distribution is also $\mathrm{Poi}(\lambda)$. how will we achieve our goal? note that through the sufficiency principle, we may reduce the problem by simply assuming that we had observed $t = \sum_{i=1}^{n} y_i$, the realization of the sufficient statistic for $\lambda$ given by $T = \sum_{i=1}^{n} Y_i$, which has a $\mathrm{Poi}(n\lambda)$ distribution. the reduced problem therefore is that we have $(T, Y_0)$, which are independent random variables with $T \sim \mathrm{Poi}(n\lambda)$ and $Y_0 \sim \mathrm{Poi}(\lambda)$, and our goal is to construct a $100(1 - \alpha)\%$ prediction region $\check{\Gamma}(T, U; \alpha)$ for $Y_0$, which utilizes $T$, and possibly a randomizer $U$ which is independent of $(T, Y_0)$. given $T = t$, the maximum likelihood estimate (mle) of $\lambda$ is $\hat{\lambda}(t) = t/n$.
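by virtue of the consistency of $\hat{\lambda}(T)$ for $\lambda$ as $n \to \infty$, a seemingly straightforward approach to constructing a prediction region for $Y_0$ is to replace $\lambda$ in $\Gamma_0(U; \lambda, \alpha)$, $\Gamma_1(\lambda, \alpha)$, and $\Gamma_2(\lambda, \alpha)$ in (4), (5), and (6), respectively, by $\hat{\lambda}(T)$ to obtain $\check{\Gamma}_0(T, U; \alpha) = \Gamma_0(U; \hat{\lambda}(T), \alpha)$, $\check{\Gamma}_1(T; \alpha) = \Gamma_1(\hat{\lambda}(T), \alpha)$, and $\check{\Gamma}_2(T; \alpha) = \Gamma_2(\hat{\lambda}(T), \alpha)$ (7). how do these prediction regions compare with each other in terms of performance, both in the context of their coverage probabilities and also their cardinalities, whose surrogates are lengths? in particular, how does substituting $\hat{\lambda}(T) = T/n$ for $\lambda$ impact the coverage probabilities of these prediction regions, and are they still valid, even in an asymptotic sense? it is not clear how the substitution of $\lambda$ by $\hat{\lambda}(T) = T/n$ will impact the exact performance of the first prediction region $\check{\Gamma}_0$. however, for the second and third prediction regions $\check{\Gamma}_1$ and $\check{\Gamma}_2$, we could alter them to take into account the substitutions, provided that $\lambda$ is large. as noted earlier, when $\lambda$ is large, $\check{\Gamma}_0 \approx \check{\Gamma}_1$, so the alteration of $\check{\Gamma}_1$ should also apply, approximately, to $\check{\Gamma}_0$. the changes in the distributions of the pivotal quantities arising from these substitutions are reflected below, a consequence of the delta method: $(Y_0 - \hat{\lambda})/\sqrt{\lambda(1 + 1/n)} \approx N(0, 1)$ and $2(\sqrt{Y_0} - \sqrt{\hat{\lambda}})/\sqrt{1 + 1/n} \approx N(0, 1)$. from these normal approximations, we could improve the prediction intervals $\check{\Gamma}_1$ and $\check{\Gamma}_2$ into the following prediction intervals, which take into account the impact of these substitutions: $\check{\Gamma}_1(T; \alpha) = \left[\lceil 0 \vee (\hat{\lambda} - z_{\alpha/2}\sqrt{\hat{\lambda}(1 + 1/n)}) \rceil,\ \lfloor \hat{\lambda} + z_{\alpha/2}\sqrt{\hat{\lambda}(1 + 1/n)} \rfloor\right]$ and $\check{\Gamma}_2(T; \alpha) = \left[\lceil (0 \vee (\sqrt{\hat{\lambda}} - \tfrac{z_{\alpha/2}}{2}\sqrt{1 + 1/n}))^2 \rceil,\ \lfloor (\sqrt{\hat{\lambda}} + \tfrac{z_{\alpha/2}}{2}\sqrt{1 + 1/n})^2 \rfloor\right]$, where, for notational economy, we write $\hat{\lambda}$ for $\hat{\lambda}(T)$, '$\vee$' for max, $\lceil \cdot \rceil$ for the ceiling function, and $\lfloor \cdot \rfloor$ for the floor function. note that by intersecting the intervals with $\mathbb{Z}_{0,+}$, the floor and ceiling functions are actually not needed, but we retain them in the formulas since the 'length' we consider pertains to the length of the interval. observe that if the lower limits of these intervals are not zero, which will usually be the case for large $\lambda$, then it is a simple exercise to show that, before rounding, these two prediction intervals have the same length, but they are not identical regions.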
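the equal-length remark can be checked numerically with a short python sketch. the interval formulas coded below are the delta-method forms with inflation factor (1 + 1/n) and are assumptions of this sketch wherever the source display is not legible; `adjusted_intervals` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def adjusted_intervals(n: int, t: int, alpha: float = 0.05):
    """Return (raw1, raw2, int1, int2): the unrounded and rounded versions of
    the two normal-approximation intervals with lambda estimated by t / n."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    lam_hat = t / n
    infl = math.sqrt(1.0 + 1.0 / n)      # accounts for estimating lambda
    root = math.sqrt(lam_hat)

    half1 = z * root * infl              # half-width on the original scale
    raw1 = (max(0.0, lam_hat - half1), lam_hat + half1)

    half2 = 0.5 * z * infl               # half-width on the square-root scale
    raw2 = (max(0.0, root - half2) ** 2, (root + half2) ** 2)

    def to_int(iv):
        return math.ceil(iv[0]), math.floor(iv[1])

    return raw1, raw2, to_int(raw1), to_int(raw2)


if __name__ == "__main__":
    raw1, raw2, int1, int2 = adjusted_intervals(n=20, t=600)
    print("lengths:", raw1[1] - raw1[0], raw2[1] - raw2[0])
    print("integer limits:", int1, int2)
```

when both lower limits are positive, the unrounded lengths agree, while the square-root interval sits to the right of the first one by the square of its half-width on the square-root scale, so the two regions differ even though their lengths match.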
in trying to adapt the prediction region $\Gamma_0(U; \lambda, \alpha)$ in (4) to the situation where $\lambda$ is unknown, the main idea is to replace $\lambda$ by an estimate obtained from the observed data. doing so leads to estimates of $p(k|\lambda)$, $k \in \mathbb{Z}_{0,+}$, which are then used in determining $c(\alpha)$ and $\gamma(\alpha)$ in (4). thus, $\check{\Gamma}_0$ in (7) is obtained by using the ml estimates of $\{p(k|\lambda), k = 0, 1, \ldots\}$ given by $\{p(k|\hat{\lambda}(n, t)), k = 0, 1, \ldots\}$ with $\hat{\lambda}(n, t) = t/n$. this raises the question of whether other estimates of $\{p(k|\lambda), k = 0, 1, \ldots\}$ could be utilized which may perform better than the ml estimates. an approach based on a second-order taylor expansion leads to the approximation $p(k|\lambda) \approx \hat{p}_3(k; (n, t))$, which adjusts the plug-in value $p(k|\hat{\lambda}(n, t))$. using $\hat{p}_3(k; (n, t))$ in place of $p(k|\lambda)$ in (4) results in the prediction region denoted by $\check{\Gamma}_3(n, t; \alpha)$. another intriguing possibility is to utilize the uniformly minimum variance unbiased estimator (umvue) (see [8]) of $p(k|\lambda)$, given the data $(n, T)$ with $T \sim \mathrm{Poi}(n\lambda)$, or equivalently, $Y_1, Y_2, \ldots, Y_n$ which are iid $\mathrm{Poi}(\lambda)$. the umvue of $p(k|\lambda)$, usually obtained through the rao-blackwell theorem and lehmann-scheffe theorem [8], is given by $\hat{p}_4(k; (n, t)) = \binom{t}{k}\left(\frac{1}{n}\right)^{k}\left(1 - \frac{1}{n}\right)^{t-k} I\{k \in \{0, 1, \ldots, t\}\}$, the binomial probability at $k$ with parameters $(t, 1/n)$. observe, however, that this estimator assigns zero probability to every $k$ outside of the set $\{0, 1, \ldots, t\}$. using $\hat{p}_4(k; (n, t))$ in lieu of $p(k|\lambda)$ in (4) leads to the prediction region denoted by $\check{\Gamma}_4(n, t; \alpha)$. yet another idea is to develop a procedure by borrowing from the bayesian playbook [8]. we suppose that our prior knowledge of the value of the poisson rate $\lambda$ is represented by a distribution function $G$. having observed $Y = y = (y_1, y_2, \ldots, y_n)$, the posterior distribution of $\lambda$ is given by $dG(\lambda|y) = \lambda^{t} e^{-n\lambda}\, dG(\lambda) \big/ \int_0^\infty w^{t} e^{-nw}\, dG(w)$, with $t = t(y) = \sum_{i=1}^{n} y_i$.
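to make the contrast between the plug-in and umvue estimates of p(k|λ) concrete, the following python sketch implements both and numerically checks the unbiasedness of the binomial form by averaging it over T ~ Poi(nλ). the function names and the truncation point `t_max` are hypothetical choices made for this sketch, not part of the paper.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(Y = k) for Y ~ Poi(lam); handles lam = 0 as a point mass at 0."""
    if k < 0:
        return 0.0
    if lam == 0.0:
        return 1.0 if k == 0 else 0.0
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def plug_in_pmf(k: int, n: int, t: int) -> float:
    """ML plug-in estimate: Poisson pmf evaluated at lambda_hat = t / n."""
    return poisson_pmf(k, t / n)


def umvue_pmf(k: int, n: int, t: int) -> float:
    """UMVUE of p(k | lambda): binomial(t, 1/n) probability at k."""
    if k < 0 or k > t:
        return 0.0
    p = 1.0 / n
    return math.comb(t, k) * p ** k * (1.0 - p) ** (t - k)


def expected_umvue(k: int, n: int, lam: float, t_max: int = 400) -> float:
    """E[umvue_pmf(k, n, T)] for T ~ Poi(n * lam), truncated at t_max."""
    return sum(poisson_pmf(t, n * lam) * umvue_pmf(k, n, t)
               for t in range(t_max + 1))


if __name__ == "__main__":
    n, lam = 5, 2.0
    for k in range(5):
        print(k, poisson_pmf(k, lam), expected_umvue(k, n, lam))
```

the two printed columns should agree up to truncation error, reflecting the fact that thinning a Poi(nλ) count with retention probability 1/n yields a Poi(λ) count.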
the conditional probability mass function of $Y_0$, given $Y = y$, also called the posterior predictive pmf, is $p(y_0|y; G) = \int_0^\infty p(y_0|\lambda)\, dG(\lambda|y)$ (15). if we are pure bayesians, then we will completely know, or trust, our $G$, so we could use the predictive pmf $p(\cdot|y; G)$ in lieu of the poisson pmf in (4) to form a bayesian prediction region for $Y_0$. usually, however, we may try to estimate $G$ by a $\hat{G}(\cdot; y)$ based on $y$. this brings us to the realm of the empirical bayes (eb) approach, pioneered by herbert robbins; see [24, 25, 26]. an extreme case is to 'estimate' $G$ by a degenerate distribution at the ml estimate $\hat{\lambda} = t/n$, which leads to just substituting $\hat{\lambda}$ in the poisson pmf, hence results in the prediction region $\check{\Gamma}_0$ in (7). another possibility is to try to estimate $G$ non-parametrically. however, here we implement the bayesian and eb approaches using a family of conjugate priors, so we assume $G$ is a gamma distribution with mean $\kappa/\beta$ and variance $\kappa/\beta^2$, denoted by $G_{\kappa,\beta}$, whose density function is $g_{\kappa,\beta}(\lambda) = \frac{\beta^{\kappa}}{\Gamma(\kappa)}\lambda^{\kappa-1}e^{-\beta\lambda} I\{\lambda > 0\}$, where $\kappa > 0$ and $\beta > 0$. note that $\kappa$ is the shape parameter and, with this mean and variance, $\beta$ is the rate (inverse scale) parameter. under $G = G_{\kappa,\beta}$, simplifying (15) we obtain, for $y_0 \in \mathbb{Z}_{0,+}$, $p(y_0|y; (\kappa, \beta)) = \frac{\Gamma(\kappa + t + y_0)}{\Gamma(\kappa + t)\, y_0!}\left(\frac{\beta + n}{\beta + n + 1}\right)^{\kappa + t}\left(\frac{1}{\beta + n + 1}\right)^{y_0}$ (16). when $\kappa$ is a positive integer, the pmf in (16) corresponds to a negative binomial distribution with parameters $\kappa + t$ and $(\beta + n)/(\beta + n + 1)$. the pmf in (16) could be used in place of the poisson pmf in (4) to form a bayesian prediction region for $Y_0$, given $(\kappa, \beta)$, denoted by $\check{\Gamma}_5(U, y; (\kappa, \beta))$. an approach to specifying $(\kappa, \beta)$ is to specify a prior mean and prior standard deviation for $\lambda$, say $m$ and $s$, respectively, which yield $\kappa = m^2/s^2$ and $\beta = m/s^2$. the eb approach estimates $\kappa$ and $\beta$ from the data $y = (y_1, y_2, \ldots, y_n)$. such estimation could be done via maximum likelihood using the likelihood function obtained from the joint marginal distribution of $(Y_1, Y_2, \ldots, Y_n)$ based on the model $Y_i|\lambda_i \sim \mathrm{Poi}(\lambda_i)$ and $\lambda_i \sim G_{\kappa,\beta}$.
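the gamma-poisson posterior predictive pmf can be evaluated directly on the log scale. the python sketch below assumes the rate parameterization of the gamma prior (mean κ/β, variance κ/β²), under which the posterior is gamma with shape κ + t and rate β + n; the function names are hypothetical and the code is an illustration, not the authors' implementation.

```python
import math


def prior_from_moments(m: float, s: float):
    """Gamma shape and rate (kappa, beta) matching prior mean m and sd s."""
    return m * m / (s * s), m / (s * s)


def predictive_pmf(y0: int, n: int, t: int, kappa: float, beta: float) -> float:
    """Posterior predictive P(Y0 = y0 | data) under a gamma(kappa, rate=beta)
    prior, given n observations with total count t (negative-binomial form)."""
    if y0 < 0:
        return 0.0
    shape = kappa + t
    p = (beta + n) / (beta + n + 1.0)
    log_pmf = (math.lgamma(shape + y0) - math.lgamma(shape)
               - math.lgamma(y0 + 1)
               + shape * math.log(p) + y0 * math.log(1.0 - p))
    return math.exp(log_pmf)


if __name__ == "__main__":
    kappa, beta = prior_from_moments(m=50.0, s=100.0)
    n, t = 10, 300
    probs = [predictive_pmf(k, n, t, kappa, beta) for k in range(401)]
    mean = sum(k * pk for k, pk in enumerate(probs))
    print("total mass:", sum(probs), "predictive mean:", mean)
```

with a diffuse prior such as m = 50 and s = 100, the predictive mean (κ + t)/(β + n) stays close to the ml estimate t/n, while the predictive variance exceeds the poisson variance because it also carries the uncertainty about λ.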
this likelihood function is given by $L(\kappa, \beta) = \prod_{i=1}^{n} \frac{\Gamma(\kappa + y_i)}{\Gamma(\kappa)\, y_i!}\left(\frac{\beta}{\beta + 1}\right)^{\kappa}\left(\frac{1}{\beta + 1}\right)^{y_i}$. a method-of-moments approach to estimating $(\kappa, \beta)$ based on $y$ can fail, however, because negative estimates of $\kappa$ and $\beta$ are obtained when the sample variance of $y$ is smaller than its sample mean. at this point we mention previous works dealing with prediction intervals under the poisson model. prediction interval methods for the poisson model have been incorporated in the r package EnvStats [17]. a function in this package, predIntPois, constructs prediction intervals under the poisson model. it provides four options for the type of prediction interval to construct. the methods are based on procedures presented in [9, 13, 18]. the option normal.approx in predIntPois coincides with the prediction region $\check{\Gamma}_1$ based on the normal approximation. in these earlier procedures, randomization was not utilized, hence generally conservative prediction intervals are obtained. other approaches for prediction interval construction under the poisson model, including poisson regression models, are based on bootstrapping and simulation techniques, hence are computationally intensive [3]. to compare the performance of the prediction regions $\check{\Gamma}_0$ (randomized version), $\check{\Gamma}_1$, $\check{\Gamma}_2$, $\check{\Gamma}_3$, $\check{\Gamma}_4$, and $\check{\Gamma}_5$ with $m = 50$, $s = 100$, and for $n \in \{5, 10, 15, 20, 30, 50, 70, 100\}$ and $\lambda \in \{1, 5, 15, 30, 50, 100, 200\}$, we performed simulation studies, with program codes in the r [20] environment, to determine the coverage probabilities and the lengths of the regions (recall that length is an equivalent surrogate for the cardinality of the regions since we took the ceiling and the floor of the lower and upper limits, respectively, for the intervals that lead to $\check{\Gamma}_1$ and $\check{\Gamma}_2$). for each combination of $n$ and $\lambda$, 10000 simulation replications of the basic simulation experiment were performed.
table 4 in the appendix presents the results on the coverage percentages, mean lengths of the prediction intervals, and standard deviations of the lengths for different values of $\lambda$. the basic simulation experiment is, for a fixed $n$ and $\lambda$, to generate $T \sim \mathrm{Poi}(n\lambda)$ and $Y_0 \sim \mathrm{Poi}(\lambda)$. the $T$ variable could be viewed as $T = \sum_{i=1}^{n} Y_i$, where $Y_1, Y_2, \ldots, Y_n$ are iid from a $\mathrm{Poi}(\lambda)$. the prediction regions are then constructed based on the observed $(n, T)$, with prediction coefficient of 95%. note that since $\check{\Gamma}_5$ is the bayes prediction region instead of the eb region, we only needed the value of $t = \sum_{i=1}^{n} y_i$, but if we had also used the eb approach, then we would have needed the values of $(y_1, y_2, \ldots, y_n)$ to estimate $(\kappa, \beta)$. after constructing the prediction regions, it is then determined whether $Y_0$ is contained in these regions. coverage percentage is the percentage out of the 10000 prediction regions that contain $Y_0$; mean (standard deviation) length is the average (standard deviation) of the lengths of the 10000 prediction intervals. figure 3 presents plots with respect to $n$ of the coverage probabilities (cp) and mean lengths (ml) for $\lambda \in \{1, 5, 30, 100\}$. examining table 4 and figure 3 we observe that when $\lambda = 1$, the cp of $\check{\Gamma}_2$ is very poor and even deteriorates as $n$ increases. the reason for this is that the realized $Y_0$ tends to equal 0, but the square-root transformation has a tendency to shift the prediction interval to the right, hence the interval tends to miss $Y_0$. this result for $\check{\Gamma}_2$ is consistent with the result when the rate $\lambda$ is known. when $n = 5$, $\check{\Gamma}_3$ and $\check{\Gamma}_4$ have unacceptably lower cps compared to the nominal level, while $\check{\Gamma}_0$ also has cps which are below the nominal level, as do $\check{\Gamma}_1$ and $\check{\Gamma}_5$, though the last two regions have cps closer to the desired level. the length of $\check{\Gamma}_0$ tends to be shorter than those of $\check{\Gamma}_1$ and $\check{\Gamma}_5$. as $n$ increases, the cps of $\check{\Gamma}_3$ and $\check{\Gamma}_4$ get closer to the desired level, and their lengths tend to be a bit shorter than those of $\check{\Gamma}_0$ and $\check{\Gamma}_1$.
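the basic simulation experiment can be sketched in python for one of the regions. the sketch below uses the normal-approximation interval with the (1 + 1/n) inflation as the region under study, which is an assumed form; the poisson sampler, the function names, the seed, and the reduced number of replications are choices made for this illustration and differ from the authors' r code and their 10000 replications.

```python
import math
import random
from statistics import NormalDist


def rpois(lam: float, rng: random.Random) -> int:
    """Exact Poisson draw: count unit-rate exponential arrivals in [0, lam]."""
    k, s = 0, rng.expovariate(1.0)
    while s < lam:
        k += 1
        s += rng.expovariate(1.0)
    return k


def gamma1_check(n: int, t: int, alpha: float = 0.05):
    """Integer limits of the adjusted normal-approximation interval."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    lam_hat = t / n
    half = z * math.sqrt(lam_hat * (1.0 + 1.0 / n))
    return max(0, math.ceil(lam_hat - half)), math.floor(lam_hat + half)


def simulate(n: int, lam: float, reps: int = 2000,
             alpha: float = 0.05, seed: int = 1):
    """Return (coverage proportion, mean length) over `reps` replications."""
    rng = random.Random(seed)
    hits, total_length = 0, 0
    for _ in range(reps):
        t = rpois(n * lam, rng)          # sufficient statistic T ~ Poi(n*lam)
        y0 = rpois(lam, rng)             # future observation Y0 ~ Poi(lam)
        lo, hi = gamma1_check(n, t, alpha)
        hits += lo <= y0 <= hi
        total_length += hi - lo
    return hits / reps, total_length / reps


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, simulate(n, lam=30.0))
```

with only a few thousand replications the monte carlo standard error of the coverage estimate is roughly half a percentage point, so small differences between regions would need the larger replication counts used in the paper.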
when λ = 5, the cps ofγ 0 ,γ 2 ,γ 3 , andγ 4 are all below the nominal level, whereas forγ 1 andγ 5 , their cps exceed or are quite close to the nominal level, except when n = 5. as a consequence, they ended up having longer mean lengths. these behaviors continue to hold as λ was increased, but with the cps getting closer to the nominal level, especially as n increases. when n is small, the cps ofγ 3 andγ 4 are still appreciably lower than the nominal level. when λ is large,γ 1 ,γ 2 , andγ 5 almost have the same performance. summing up our observations from these simulation studies for this no-covariate or intercept only poisson model, in terms of adapting to the estimation of the unknown rate λ,γ 1 andγ 5 possess the best performance among these six prediction regions in terms of achieving the nominal level, but they also tend to be longer than the others. negative continuously-differentiable function ρ(·), called the inverse link function, such that this is the so-called poisson regression model and belongs to the class of generalized linear models or the class of non-linear models [22, 8] . if y 0 , given x = x 0 , has a poisson distribution with rate λ(x 0 ; θ), and θ is known, then we could construct prediction regions for y 0 according to the methods described in the first part of section 2 when λ was assumed known. when θ is not known, then there is a need to estimate it. let us therefore assume that we are able to observe the sample ) and with the y i s independent and the x i s fixed. we seek to construct a prediction region for y 0 associated with the covariate vector x 0 . first, we introduce the following functions: the log-likelihood function for θ, given {(y i , x i ), i = 1, 2, . . . , n}, is given by the associated score vector function is whereas, the observed fisher information matrix is, with x ⊗2 = x t x, thus, the expected fisher information matrix is the mle of θ based on {(y i , x i ), i = 1, 2, . . . 
, n}, denoted by θ̂ , solves the equation obtained by setting the score vector to zero. this will usually be obtained through iterative procedures, such as the newton-raphson method, in which each step adds to the current iterate the inverse of the observed information matrix times the score vector. by the large-sample theory of ml estimation (cf., [8, 29] ), as n → ∞ and under regularity conditions on the sequence of covariate vectors x i , i = 1, 2, . . . , n, we have that θ̂ is approximately normally distributed with mean θ and covariance matrix i(θ) −1 . a consistent estimator of i(θ) is i(θ̂). by the delta-method, it then follows that the ml estimator of λ(x 0 ; θ) satisfies, as n → ∞, where 'tr' means trace of a matrix. using this result and when λ 0 = ρ(x 0 θ) is large, we obtain the approximate distributions of relevant pivotal quantities for constructing prediction regions. we write λ̂ 0 for λ(x 0 ; θ̂) and ψ̂ 0 for ψ(x 0 θ̂). from these pivotal quantities, we are then able to obtain approximate prediction regions for y 0 . we could also have the prediction region based on γ 0 from section 2, obtained by plugging λ̂ 0 in place of the rate, where we note that the dependence on x 0 and (y, x) is through λ̂ 0 . note, however, that we are simply plugging in the estimate of λ(x 0 ; θ), but without taking into consideration the variability inherent in the estimator λ(x 0 ; θ̂). a specific inverse link function ρ(·), which we will consider in the application to forecasting deaths in the us due to covid-19, is the exponential function ρ(w) = exp(w), so that λ(x; θ) = exp(xθ). for this special inverse link function, the score vector simplifies to $\sum_{i=1}^{n} x_i^{t}\,[y_i - \exp(x_i\theta)]$, and the observed and expected information matrices coincide and equal $\sum_{i=1}^{n} x_i^{\otimes 2} \exp(x_i\theta)$. we also mention the extension of the bayesian/eb approaches to constructing prediction regions in the regression setting. we suppose that the parameter θ in λ(x; θ) = ρ(xθ) takes values in a parameter space θ. the approach then proceeds by starting with a prior distribution π(·) on θ which quantifies our prior knowledge about θ. the posterior predictive distribution of y 0 , the response at x 0 , given the data (y, x) = {(y i , x i ), i = 1, 2, . . . 
, n}, is given by where where the product and the sum are taken over the index set associated with (y, x), so will be over {1, 2, . . . , n} for (y, x), and {0, 1, 2, . . . , n} for (y 0 , x 0 ). generally, there will be no family of conjugate prior distributions on θ with respect to the poisson regression model, so the function h will not be in a closed analytical form, so that it has to be computed numerically, for instance, using markov chain monte carlo (mcmc) algorithms. nevertheless, upon obtaining the posterior predictive distribution of y 0 given in (27), a prediction region is then obtained by using this pmf p(y 0 |x 0 , (y, x)) in lieu of the poisson pmf in (4), analogously to the development of the prediction regionγ 5 in the intercept only model. the prior distribution π will involve hyper-parameters, for example, if θ = p+1 , π could be specified to be a multivariate normal distribution with mean vector µ and covariance matrix σ, so (µ, σ) will be the hyper-parameters. for the bayesian, these hyper-parameters will be assigned values, unless an improper prior distribution (e.g., lebesgue measure), which need not involve unknown hyper-parameters, is adopted; whereas, for the empirical bayesian, these hyper-parameters will be estimated using the data (y, x). because of the need to approximate the posterior predictive pmf through numerical methods, these bayesian and eb approaches to constructing a prediction region for y 0 are clearly computationally-intensive, especially if used in a simulation study to investigate their properties, such as their coverage probabilities and their lengths. because of the need to specify a non-conjugate prior and its hyper-parameters and the need for intensive computations, these bayesian and eb procedures are not included in the illustrations, simulations, and applications. it is clear, though, that they are highly viable alternative procedures and should be further explored. 
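as an aside on the maximum likelihood step described before the bayesian discussion, the newton-raphson iteration for the log-link case ρ(w) = exp(w) can be sketched as follows. this is a minimal illustration under simplifying assumptions, not the authors' implementation (they used r's glm): the model has only an intercept and a single covariate w, so the 2 × 2 information matrix can be inverted by hand, and the function name `fit_poisson_loglink` is hypothetical.

```python
import math


def fit_poisson_loglink(w, y, max_iter=50, tol=1e-10):
    """Newton-Raphson for the model log(lambda_i) = t0 + t1 * w_i.

    Score:       u = sum_i x_i^T (y_i - lambda_i), with x_i = (1, w_i)
    Information: I = sum_i x_i^T x_i * lambda_i   (observed = expected for log link)
    Update:      theta <- theta + I^{-1} u
    """
    t0, t1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(max_iter):
        lam = [math.exp(t0 + t1 * wi) for wi in w]
        u0 = sum(yi - li for yi, li in zip(y, lam))
        u1 = sum(wi * (yi - li) for wi, yi, li in zip(w, y, lam))
        a = sum(lam)                                      # I[0][0]
        b = sum(wi * li for wi, li in zip(w, lam))        # I[0][1] = I[1][0]
        c = sum(wi * wi * li for wi, li in zip(w, lam))   # I[1][1]
        det = a * c - b * b
        d0 = (c * u0 - b * u1) / det
        d1 = (a * u1 - b * u0) / det
        t0, t1 = t0 + d0, t1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return t0, t1


if __name__ == "__main__":
    w = [0, 1, 2, 3, 4, 5]
    y = [2, 3, 6, 7, 12, 20]  # made-up counts, for illustration only
    print(fit_poisson_loglink(w, y))
```

because the model contains an intercept, the first score equation forces the fitted rates to sum to the observed total, which is why fitted and observed cumulative counts agree on the last day used in the fit.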
we demonstrate these prediction regions, depicted as intervals in the plots, via the following experiment. we specify a sample size n and an order p. we then generate iid realizations w i , i = 1, 2, . . . , n, n + 1, from either a n (µ, σ 2 ) distribution or a standard uniform distribution, and form the covariate vectors x i = (1, w i , w 2 i , . . . , w p i ), i = 1, 2, . . . , n, n + 1. for a specified θ = (θ 0 , θ 1 , . . . , θ p ) t , the poisson rates are computed. the ith response y i is a realization of a random draw from a p oi(λ i ). the response vector is y = (y 1 , y 2 , . . . , y n , y n+1 ) t . the goal is to construct a prediction region for (25), (23) , and (24), respectively. the procedures were coded into r functions and these will be made available publicly in due time. we present the results pictorially via a scatterplot of {(w i , y i ), i = 1, 2, . . . , n, n + 1}. the realized value y 0 ≡ y n+1 of y 0 is highlighted and the three prediction intervals for y 0 are also plotted. included in the plot is the theoretical curve for the λ(x; θ) as a function of w and we also super-impose the fitted curve. prediction regions for y i at w i , i = 1, 2, . . . , n, are also depicted in the plot. , which all contained y 101 . two things to observe from this plot are (1) the prediction regionγ 2 was shifted to the right relative toγ 1 , and (2) the prediction regionsγ 0 with respect to the w-values are scissorlike or jagged. the latter is a consequence of the randomization approach in the construction of the prediction regions and this non-smooth behavior becomes more apparent since the realized values of the y i s are small. the third realization, depicted in figure 6 , is from a model with n = 200, p = 3, with w i ∼ n (µ = 1, σ = 2), and with θ = (3, .2, −.1, −.05). the rate curve λ(w) as a function of w goes to zero as w increases, but goes to ∞ as w decreases, with a local minimum and maximum close to w = −2 and w = 1, respectively. 
the target of the prediction regions was y n+1 . in each of these illustrative realizations, the three prediction regions did not vary much from each other in terms of their sizes, except in the first case where γ 0 was shorter but barely covered the value being predicted. a question that now arises is how their coverage probabilities and their mean lengths compare with each other. to gain some insights into these comparisons, we performed simulation studies under the four different models described above, with each simulation run having 10000 replications. the sample sizes considered were n ∈ {30, 50, 100, 200}. in the appendix, tables 11, 12, 13 and 14 summarize the results of these simulations where we report the coverage probabilities (cp), mean lengths (ml), and standard deviation of lengths (sl). examining these tables, it appears that γ 0 has coverage probabilities that are below the nominal level (between 3% and 4% below in table 14 when n = 30), with the discrepancy being more pronounced when the sample size is small. as the sample size is increased, these observed coverage probabilities get closer to the nominal level. this deficiency is due to the estimation of the θ parameter and, as previously noted, γ 0 does not take into consideration the variability in the resulting estimator of λ(x 0 ; θ). on the other hand, γ 1 and γ 2 both achieve coverage probabilities that are quite close to the nominal level, especially γ 1 , when n is large. γ 0 , however, tends to have a lower mean length compared to the mean lengths of γ 1 and γ 2 , with the differences in mean lengths becoming alarmingly large for the model in table 13 . recall that for this model, the rate curve increases to ∞ as w decreases to −∞, and since the w i 's are generated from a normal distribution, on some occasions, w n+1 falls outside the range of {w i , i = 1, 2, . . . , n}. depending on how different w n+1 is from the mean of w 1 , w 2 , . . . 
, w n , this could lead to a large estimate of the standard error of λ̂ n+1 , thus leading to very wide prediction regions for γ 1 and γ 2 . since γ 0 simply utilized the estimate λ̂ n+1 , but was totally oblivious to its variability, it was not much affected in such a situation. however, because of its rigidness with respect to this added variability, its coverage could dramatically suffer. we demonstrate this situation by plotting an extreme realization in a separate figure. this particular demonstration warns us of the danger and pitfalls of making a prediction for a response variable that is associated with a covariate vector outside the convex hull of the covariate vectors used in the construction of the prediction regions and when the poisson rate hyper-surface generated by the map x → ρ(xθ) is complex. as a word of caution, when performing extrapolation to do predictions, be forewarned of sinkholes littering the forecasting road, and, if it could be avoided, make no forecasts on long, especially very long, horizons. but, alas, this is the type of forecasting problem that is actually realistic and of most interest, such as that of predicting the number of cases or deaths due to covid-19 at a future date, given the observed data up to a certain date. based on the results of these simulation studies, the prediction region γ 1 appears to be the most preferable among the three prediction regions. in our illustration using the covid-19 data set in section 4, we will therefore just present the prediction region provided by γ 1 .
we now present in this section an illustration of the potential application of the procedures discussed in the preceding sections. one of the interesting questions during this covid-19 pandemic is the forecasting of the number of cumulative deaths in the us at a given date, for example, at the end of may 31, 2020, given information up to a certain date, say may 15, 2020. 
such forecasts are of critical importance since they could partly be the basis of highly consequential and possibly controversial decisions by federal, state, and local governments officials, school administrators, executives of big corporations and small businesses, religious leaders, and many others. such decisions could pertain to when to institute stay-in-place directives, when to issue social distancing or social easing guidelines, when to open business establishments, when to open public places such as shopping malls and ocean beaches, when to allow religious gatherings, etc. data for daily deaths and cumulative [11] . clearly, the sequence of cumulative deaths does not satisfy the independence assumption, so a non-homogeneous poisson process model [23] is not an appropriate model for cumulative deaths when viewed as a continuous-time stochastic process. however, a non-homogeneous poisson process could plausibly model the occurrences of deaths in continuous-time, from which it follows that the sequence of daily deaths will be independently poisson distributed with possibly different rates depending on the number of days from the time origin and the specific day of the week, as well as other features such as, for example, the quality of the health care facilities, which is hard to quantify and not available in the european cdc data set. our novel idea therefore is to utilize poisson regression to predict the number of daily deaths according to the methods developed earlier, and then to aggregate these daily forecasts to obtain forecasts of the cumulative deaths. we will use the data set for the us provided by the european cdc ( [11] ) plotted in figure 1 which are the observed numbers of daily deaths attributed to covid-19 starting on march 1, 2020, the day after the first reported death due to covid-19, until may 15, 2020. note that, technically, this will be the deaths data at the end of may 14, 2020. 
using this data set on may 15th, and given the cumulative number of deaths until then, the goal is to forecast the cumulative number of deaths in the us due to covid-19 by the end of may 31, 2020, that is, june 1, 2020. we limit our illustration to simply utilizing the variable daynum, which is the number of days starting from december 31, 2019; day, which is the day of the week; and deaths, the variable representing the daily number of deaths. we surmise that the deaths data set is the most reliable among the data sets that were compiled, compared, for instance, to the data set pertaining to the number of cases or infected people. however, the deaths data set need not be totally reliable either and could be subject to misclassification error and competing causes of death. for example, a patient who contracted covid-19 and who dies primarily because of pneumonia may be classified as having died of covid-19, but could also be classified as having died, not of covid-19, but of pneumonia. see also, for instance, the wsj article [14] , [2] , and the bbc news article https://www.bbc.com/news/world-53073046 [10] , the last two discussing the notion of "excess deaths," which are deaths that may have been due to the pandemic, but which are not included in the reported covid-19 deaths data set. certainly, we could have used other information such as the number of reported cases, or performed separate forecasts in each of the 50 states and the district of columbia and then aggregated them, or even utilized counties or metropolitan cities as strata and then combined forecasts from these strata to obtain an overall forecast for the whole us. however, for illustrative purposes, we decided to keep things simple. for daynum from 62 to 137, consisting of 76 days, we have the daily number of deaths, hence also the cumulative number of deaths. the other variable used in the modeling is day (e.g., sunday, monday, etc.) 
associated with each value of daynum, which is a categorical or factor variable. we fitted a poisson regression model using the glm function in r with a log-link, with response being y = deaths and covariate vector x = (1, w, w 2 , w 3 , w 4 , w 5 , day), where w = daynum, and day is considered as a factor variable, hence is converted into six dummy variables in the design matrix (six instead of seven, since we already have an intercept term). we chose this 5th-order model with respect to daynum since the akaike information criterion (aic) values, computed under the poisson regression model, appear to stabilize starting at this model, and since this choice adheres to the law of parsimony (occam's razor). the aic values associated with the 8th- and 9th-order models were actually smaller than for the 5th-order model; however, these models possess highly unstable predicted values. table 1 summarizes the aic values for the different models, both without and with day in the model. it also contains the estimates of ξ, the overdispersion parameter in a model that will be introduced shortly, and as we will then see, larger values of ξ are indicative of the poisson regression model becoming a more adequate model. the histogram and time plot of the residuals with respect to daynum are provided in figure 9 . recall that the ith residual in poisson regression is defined as where λ̂ i is the fitted value associated with x i . as such, if the poisson regression model is adequate, one should see a histogram similar to that associated with a centered (at zero) poisson distribution with unit rate, but this will just be an approximation since the rates are estimated, which affects the distribution of the residual. similarly, the time plot of the residuals should be randomly distributed about the zero horizontal line. 
the histogram and time plot in figure 9 show some outliers. the state of new york made an adjustment of 3778 to its death counts on april 16th [1] ; thus, the reported number on april 16th of 4928, the highest number of daily deaths reported, is an outlier explained by the adjustment made. however, we still included these perceived outliers in the fitting of the fifth-order, with respect to daynum, poisson regression model. later, when we consider forecasting for july 15th and august 1st, and since another significant adjustment was made on june 26th, we will re-allocate each of the adjustments proportionately to the observed deaths on the days on or before the adjustment day. based on the fitted model's residuals, we further assessed the independence assumption of the daily deaths. we do this by creating a contingency table for daynum with six intervals and with residuals being either negative or positive and then performing a test for independence. the observed contingency table is presented in table 2 . a chi-square test for independence based on this table yielded χ 2 c = 4.4685 on 5 degrees-of-freedom, with associated p-value of 0.4841, hence the null hypothesis of independence cannot be rejected. observe, however, that the fit of the model in the early days is not satisfactory, and between daynum 99 and 112, there was a preponderance of negative residuals, possibly owing to the influence of the adjusted reported daily deaths on daynum 108. it may appear surprising and counter intuitive to include a day effect in the model. recall that our main objective is to obtain a prediction region for the cumulative number of deaths at a specified date, in our case june 1, 2020 (daynum = 154). this means we were predicting the cumulative deaths at the end of may 31, 2020. from daynum = 138 to daynum = 154, there were a total of 17 days. if we denote by y j , j = 62, 63, . . . , 154, the random variable denoting the daily number of deaths for daynum = j, the random variable denoting the cumulative number of deaths until daynum = k is $s_k = \sum_{j=62}^{k} y_j$.
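the chi-square check on the residual signs reported above can be reproduced with the standard library alone. the sketch below is illustrative: the function names are hypothetical, the tail probability uses the closed-form expression for a chi-square distribution with an odd number of degrees of freedom, and the counts of table 2 are not reproduced here, so only the reported statistic is plugged in.

```python
import math


def chi2_statistic(table):
    """Pearson chi-square statistic for an r x c table of counts."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    total = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def chi2_sf_odd_df(x, df):
    """Upper-tail probability P(X > x) for chi-square with odd df.

    Uses P = erfc(sqrt(x/2)) + 2*phi(sqrt(x)) * sum_{r=1}^{(df-1)/2} chi^(2r-1) / (1*3*...*(2r-1)),
    where phi is the standard normal density and chi = sqrt(x).
    """
    chi = math.sqrt(x)
    phi = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    term, series = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        series += term
        term *= x / (2 * r + 1)
    return math.erfc(chi / math.sqrt(2.0)) + 2.0 * phi * series


if __name__ == "__main__":
    # statistic and degrees of freedom as reported in the text (2 x 6 table)
    print(chi2_sf_odd_df(4.4685, 5))
```

for a 2 × 6 table the degrees of freedom are (2 − 1)(6 − 1) = 5, which matches the value quoted in the text.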
thus, we are seeking a prediction region for s 154 , given that s 137 = 85906. under the fitted poisson regression model, y j , given $x_j = (1, w_j, w_j^2, w_j^3, w_j^4, w_j^5, d_{1j}, \ldots, d_{6j})$, with w j = daynum j and d kj , k = 1, 2, . . . , 6, the dummy variables representing whether day is a tuesday, a wednesday, a thursday, a friday, a saturday, or a sunday, respectively, has a poisson distribution with rate λ(x j ; θ) = exp {x j θ} . we mention that in our r code for fitting this model, for computational stability, we first centered and standardized the non-constant columns of the x j 's. let α * ∈ (0, 1), and recall that the y j s are independent. from the preceding section we know how to construct a 100(1 − α * )% prediction interval [a j , b j ] for y j , where, as mentioned earlier, we will simply utilize the prediction region γ 1 . by the independence, and writing $a_\bullet = \sum_{j=138}^{154} a_j$ and $b_\bullet = \sum_{j=138}^{154} b_j$, we will have $P\{a_\bullet \le \sum_{j=138}^{154} y_j \le b_\bullet\} \ge \prod_{j=138}^{154} P\{a_j \le y_j \le b_j\} \approx (1 - \alpha^*)^{17}$. thus, if we wanted s 137 + [a • , b • ] to be a 100(1 − α)% prediction interval for s 154 , given s 137 , we could choose α * = 1 − (1 − α) 1/17 . this procedure will guarantee a conservative 100(1 − α)% prediction interval for s 154 , given s 137 . this is the approach we followed in constructing a (conservative) 95% prediction interval for s 154 , the cumulative number of deaths due to covid-19 in the us by the end of may 31, 2020. we implemented the above procedure and also constructed the prediction intervals at each of the observed daynum values, which are depicted in figure 10 . examining this figure, note that there are more observed deaths outside the prediction curves than what is expected nominally. this indicates that either there is more variability inherent in the stochastic mechanism generating the observed number of daily deaths relative to a purely poisson regression model, or the fifth-order poisson rate model is still inadequate, or both. we propose an approach that introduces over-dispersion with the poisson regression model serving as a hidden model. 
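the adjustment of the per-day prediction coefficient and the aggregation of daily intervals into a cumulative interval, as described above, amount to the following arithmetic. this is a small sketch with hypothetical function names; the daily intervals in the usage lines are made-up numbers, not values from the fitted model.

```python
def per_day_alpha(alpha, horizon):
    """alpha* = 1 - (1 - alpha)^(1/horizon), so that (1 - alpha*)^horizon = 1 - alpha."""
    return 1.0 - (1.0 - alpha) ** (1.0 / horizon)


def cumulative_interval(s_now, daily_intervals):
    """Shift the current cumulative count by the summed lower and upper daily bounds."""
    lower = s_now + sum(a for a, _ in daily_intervals)
    upper = s_now + sum(b for _, b in daily_intervals)
    return lower, upper


if __name__ == "__main__":
    # 17 days from daynum 138 to 154, target level 95%
    print(per_day_alpha(0.05, 17))
    # hypothetical daily intervals [a_j, b_j], for illustration only
    print(cumulative_interval(85906, [(100, 900), (120, 950), (90, 1000)]))
```

with α = 0.05 and a 17-day horizon, the per-day level works out to roughly 0.00301, so each daily interval is constructed at about the 99.7% level.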
we mention that our occam's razor-type solution is motivated by frailty modeling in survival analysis (see, for instance, [5] ). our model assumes the existence of an unobserved positive latent variable z j of mean 1 at w j = daynum j , and the reported number of deaths y j is the integer part of z j y * j , with {y * j } arising from a poisson regression model and with z j and y * j independent. recall that frailty models in survival analysis are used specifically to model correlations among observations; whereas, in our model it serves as an unobserved random contamination component in the observed number of daily deaths. in our implementation, we shall take z j to have a gamma distribution with mean one and variance 1/ξ (see [8] ). this ξ is then an additional parameter in the regression model aside from the parameter vector θ. such a model leads to an over-dispersed poisson regression model, with the purely poisson regression model embedded in this model and obtainable as a limiting case when ξ → ∞. inference for such a model requires further study, with possible use of an em-type algorithm, though this could be difficult to implement since the distributions of the y i s are not in closed forms. however, we may implement a z-estimation approach (see, for instance, [29] ). we first note that the approximate higher-order moments of the y i s are also obtainable. based on these moments, we could form the set of estimating equations, where we recall that ρ(w) = exp(w): by z-estimation theory ( [29] ) and under regularity conditions, it will follow that, for some (p + 1) × (p + 1) matrix in fact, let us introduce the (p + 1) × 1 vector functions: denote by h the (p + 1) × (p + 1) matrix function consisting of the derivatives of u with respect to (θ, ξ). 
the components of this matrix function are: x ⊗2 i ρ(x i θ) and h 12 ((y, x); θ, ξ) = 0; we could then obtain the estimates via the newton-raphson (nr) method with iteration step it turns out that a simpler way to obtain the estimates of θ and ξ is to first obtain the estimatẽ θ of θ from the first estimating equation. this could be done by using the glm object function in r [20] with the poisson family and logarithm link. the estimateξ of ξ is then obtained from the second estimating equation using a one-variable nr iteration with θ replaced byθ. define the (p + 1) × (p + 1) matrices σ and ω according to with plim denoting "in-probability limit" as n → ∞. then, the asymptotic covariance matrix of (θ t ,ξ) t is we note that another way of obtaining estimates of σ and ω is to obtain their theoretical expressions using higher-order moments of the y j s. these expressions depend on θ and ξ, so estimates could then be obtained by replacing θ and ξ byθ andξ. we also point out that ξ 11 is generally not equal to σ −1 11 when ξ is finite. in fact, it is imperative that ξ 11 should be used instead of σ −1 11 since it takes into account the impact of the estimation of ξ. applying the delta-method [29] , we then have the pivotal quantity result, as n → ∞, given by where quantities with 'ˆ' are estimates obtained by plugging inθ andξ for θ and ξ in their respective expressions. from this pivotal quantity, it follows that an approximate 100(1 − α)% prediction interval for y 0 is given by because of the term 1/ξ, when ξ is small, this prediction interval will be wider than the prediction intervals under the purely poisson regression model. as ξ → ∞, which makes the model approach the poisson regression model, then this prediction interval will approachγ 1 . implementing this procedure based on this over-dispersed poisson regression model, we first examine the prediction intervals at each of the observed daynum values, which are shown in figure 11 . 
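the widening effect of the 1/ξ term can be illustrated numerically. the sketch below makes two labeled assumptions that are not taken from the paper: it ignores the integer-part truncation, so that a gamma-mixed poisson count with frailty mean 1 and variance 1/ξ has mean λ and variance λ + λ²/ξ, and it uses a simple closed-form moment estimate of ξ as a stand-in for the second estimating equation (whose exact form is not shown in the text). it also ignores the estimation error in θ̂ and ξ̂, so the half-width is a lower bound on that of the interval described above. all function names are hypothetical.

```python
import math


def mixture_sd(lam, xi):
    """SD of a gamma-mixed Poisson count (truncation ignored): sqrt(lam + lam^2/xi)."""
    return math.sqrt(lam + lam * lam / xi)


def moment_xi(y, lam):
    """Assumed moment estimate solving sum[(y_i - lam_i)^2 - lam_i - lam_i^2/xi] = 0."""
    excess = sum((yi - li) ** 2 - li for yi, li in zip(y, lam))
    if excess <= 0:
        return math.inf  # no evidence of over-dispersion: pure Poisson limit
    return sum(li * li for li in lam) / excess


def half_width(lam, xi, z=1.96):
    """Half-width of a normal-approximation interval, parameter uncertainty ignored."""
    return z * mixture_sd(lam, xi)


if __name__ == "__main__":
    lam = 1500.0  # hypothetical fitted daily rate
    print(half_width(lam, math.inf))   # pure Poisson limit
    print(half_width(lam, 16.89016))   # with the xi estimate reported in the text
```

as ξ grows the second term vanishes and the half-width falls back to the poisson value z·sqrt(λ), mirroring the limiting behavior noted in the text.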
we now see that these approximate 95% prediction intervals cover most of the observed daily deaths. this indicates that the over-dispersed poisson regression model provides a better fit to the observed daily deaths data than the purely poisson regression model whose approximate 95% prediction intervals are shown in figure 10 . the estimate of ξ turned out to be ξ̂ = 16.89016. the left-panel of figure 12 is the scatterplot of daily deaths, but now including the actual observed values after daynum 137 and until 154 (these are the red dots) and the prediction intervals for the daily deaths past 137. observe that the prediction intervals for daynum between 62 and 137 are wider than those in figure 11 ; the reason for this is that the prediction coefficient used is now adjusted for the goal of constructing a prediction interval for s 154 . in this case, α * = 0.003012. the right-panel of figure 12 displays the prediction interval for s 154 , given s 137 ; in fact, this also displays the prediction intervals for s k , k = 138, 139, . . . , 153, given s 137 . the red dots are the actual observed values past daynum equal to 137. the predicted cumulative deaths on daynum = 154 was ŝ 154 = 96876 and the conservative prediction interval for s 154 was [86157, 118323] . in both of these plots, notice that the 5th-order prediction model did not perform well past daynum = 150, though the prediction interval for s 154 did cover what was actually observed, which was 104383. from figure 12 , an elevated number of daily deaths occurred starting at daynum = 150 (may 28). the grim milestone of 100000 cumulative deaths due to covid-19 in the us was also surpassed on this day. to partly assess the sensitivity of the procedure, if we had used the data until daynum = 145 (may 23rd), the predicted value for s 154 is ŝ 154 = 101010 and the conservative 95% prediction interval, given s 145 , is [96567, 106057]. the associated plots in this case are provided in figure 13 . 
on the other hand, if we had used the data until june 1, 2020 (daynum = 154), the fitted value is 104383, which coincided with the observed cumulative number of deaths. that the fitted cumulative number of deaths and the observed cumulative number of deaths on the last day were equal is actually a consequence of the estimating equation, hence in hindsight is not a surprising result. the associated plots in this case are provided in figure 14 . observe in the right-panel of figure 14 that the model for the cumulative deaths based on this 5th-order model is quite excellent, lending support to our novel approach of modeling the daily deaths data, instead of the cumulative deaths data, for the purpose of making predictions for the cumulative deaths. we summarize the results of this sensitivity analysis in table 3 and figure 15 , where we report the predicted values and the prediction intervals for s 154 under scenarios where data used in the model fitting is up to the different days from daynum equal to 137 up to 153. based on this analysis, the fifth-order model appears to possess stability since the predictions and the prediction intervals for s 154 remain somewhat consistent as the amount of data being used in the model fitting varies. as to be expected, note that as the forecasting horizon shortens, then the prediction interval also narrows. this is the case, even though we still considered the outlier on april 16th, which was the adjusted deaths data point, as a legitimate observation. however, any model, especially higher-order models, will have a breakdown point in the sense of yielding seemingly unreasonable predictions, perhaps due to a long forecasting horizon, insufficient amount of data, or wildly changing data points drastically altering estimates which highly impact forecasts. for this 5th-order model, it appears to break down when the data used is on or before daynum 132. 
one possible cause appears to be sharp increases and decreases in the observed daily deaths. for instance, on daynum = 120 the reported daily deaths was 1369, but on the next two days they were 2110 and 2611, and these led to a huge jump in the predicted value for s 154 . also, for daynum ranging from 125 to 133, the reported daily deaths were 1317, 1297, 1252, 2144, 2353, 2239, 1510, 1624, and 734, and the predictions were highly unstable and only started to stabilize after daynum = 132. this is a clear warning on the danger of fitting higher-order models or extrapolating with a long forecasting horizon. as the adage goes, attributed to niels bohr [28] , with similar versions attributed to mark twain, yogi berra, and others: it is difficult to make predictions, especially about the future.
we now turn to a forecast made with dire thoughts of the inherent dangers of forecasting over a long horizon, and fully cognizant of the many eventualities (e.g., re-opening of economy; nationwide protests and riots due to police brutality; changing hotspots; adjustments on counts; etc.) which we are not taking into account in order to be purely data-driven, but which could drastically alter trajectories of daily and cumulative deaths. on the basis of the observed data up to july 2, 2020, at which point the cumulative deaths were 128062, we seek to forecast the cumulative deaths fifteen days forward, which will be july 16, 2020. we should mention that on june 26th, there was an adjustment of 1854 which occurred "following a state review of death certificates and prior outbreaks" in the state of new jersey [6] . after the 3778 adjustment made by the state of new york on april 16th [1] , this is the second documented non-trivial adjustment made on the daily deaths counts. 
we provide two point predictions and prediction regions for the target date: the first one based on considering the observed deaths values on april 16th and june 26th as legitimate values, and the second one based on re-allocating the adjustment values on those days proportional to the observed deaths on the days on or before the day of adjustment. without additional information, such a proportional re-allocation of the adjustment values appears to be most sensible, though this approach is not immune to criticism. figure 16 presents the point prediction and the prediction interval for july 16, 2020 based on these two analyses with a 5th-order model. when the adjustment values are not re-allocated, the predictions and prediction intervals are depicted in the two top plots, whereas when they are re-allocated, they are in the two bottom plots. with no re-allocation, the point prediction is 143272 together with an associated prediction interval of [128062, 176957] for the cumulative deaths. with re-allocation, the point prediction is 146055 with an associated prediction interval of [128121, 185369] . observe that without re-allocation, the observed deaths of 2437 on june 26th fell outside of the prediction interval on that date, while the observed deaths of 4928 on april 16th barely fell inside the prediction interval on that date. notice the wider prediction intervals for deaths when no re-allocations were performed compared to those with re-allocations for the observed daynum values. observe also the widening prediction intervals as we go farther away from july 2nd, indicating high uncertainty on what may happen moving forward. interestingly, the prediction interval for the cumulative deaths on july 16th when re-allocations were performed is wider than that without re-allocations, and the point prediction is also a tad higher. 
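the proportional re-allocation of an adjustment described above can be sketched as follows. this is one possible reading, stated as an assumption: the adjustment is first removed from the count on the adjustment day, and is then spread over that day and all earlier days in proportion to the remaining counts, without rounding. the function name `reallocate` and the numbers in the usage lines are hypothetical.

```python
def reallocate(deaths, k, adjustment):
    """Spread `adjustment`, booked on day index k, over days 0..k.

    Assumption: weights are the counts on days 0..k after removing the
    adjustment from day k; days after k are left unchanged. The total
    number of deaths is preserved.
    """
    base = list(deaths[: k + 1])
    base[k] -= adjustment
    total = sum(base)
    spread = [b + adjustment * b / total for b in base]
    return spread + list(deaths[k + 1:])


if __name__ == "__main__":
    # made-up daily counts with an adjustment of 100 booked on the fourth day
    print(reallocate([10, 20, 30, 140, 50], k=3, adjustment=100))
```

a design consequence of this reading is that the cumulative count on and after the adjustment day is unchanged, while the cumulative counts before it are raised.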
ominously, notice that the prediction curve for deaths appears to be acquiring an increasing trend past july 2nd, in contrast to the decreasing trend from daynum 120 (april 28th) to 183 (june 30th). it remains to be seen if this is the effect of the lessening of social distancing guidelines, re-opening of business establishments and beaches, people gathering because of the current social unrest, or changing hotspots in the country. of course, in forecasting settings with new data points accruing frequently (daily in this covid-19 pandemic), forecasts should be updated as each new data point accrues. we intend to provide a publicly-accessible software applet to enable interested users to update forecasts with the latest updated data.
figure 16 : based on the available data until july 2, 2020, point predictions and prediction intervals of the deaths and cumulative deaths by july 16, 2020 (daynum = 199) based on analyses where adjustment values were not re-allocated (top plots) and re-allocated (bottom plots).
motivated by the covid-19 pandemic, we examined the problem of constructing prediction regions for a poisson distributed random variable, both under the no-covariate (that is, intercept only) and with-covariate settings. we compared the performances of the different prediction regions through simulation studies. in the regression setting, we also introduced an over-dispersed poisson regression model upon observing over-dispersion in the covid-19 reported deaths data relative to a purely poisson regression model. with the ultimate goal of predicting cumulative deaths due to covid-19 at a future date, we first studied how to construct prediction intervals for the daily deaths data, and then utilized these prediction intervals to construct the prediction interval for the cumulative deaths. the final fitted models involved a 5th-order model in the variable daynum, and also included the factor variable day. 
based on data until july 2, 2020, a point prediction and a prediction interval for the july 16, 2020 cumulative deaths were obtained. the methodologies developed have the potential to be used in the monitoring of daily and cumulative deaths during epidemics or pandemics through the construction of prediction regions, which could then be used by decision-makers when implementing social distancing or easing guidelines and when deciding on the closure or opening of business, educational, government, and other establishments. however, further studies are needed to compare our methodologies to other methods that have been proposed during this pandemic. the prediction and prediction region procedures we developed also possess limitations. first, in contrast to the susceptible-exposed-infected-recovered (seir) compartment model (cf. [4]), which is based on a continuous-time markov chain, our model does not posit an upper bound on the number of people that could die, although such a bound clearly exists. second, especially since the model involves higher-order terms in daynum, the procedures are highly sensitive to outliers, such as when huge adjustments are made, as in the cases of the states of new york and new jersey [14, 6]. it would be desirable to develop procedures that are robust to such non-trivial adjustments, or procedures that impose constraints on the rate of increase or decrease of the prediction curve via regularization to curb the influence of such adjustments, though this will entail developing new theory for the construction of prediction regions. due to these first two limitations, the proposed methods are not suitable for use in long-horizon forecasting, hence our decision to simply forecast 15 days forward. third, the procedures are not adaptive in their choice of the prediction model. improvements may be possible by choosing the prediction model in a data-dependent manner, but then model choice uncertainty needs to be accounted for in constructing prediction regions.
fourth, there could be an advantage in utilizing other basis functions to transform the variable daynum, such as laguerre polynomials, legendre polynomials, trigonometric functions, or even splines or wavelets. these limitations of the proposed methods generate several potential research avenues for further studies.
table 4: simulated coverage probabilities, mean of the lengths, and standard deviation of the lengths of the prediction intervals γ_0 and γ_j, j = 1, 2, 3, 4, 5, for different λ's and n's. for each combination of (n, λ), 10000 replications were performed. for λ = 1.
table 5: table 4 continued, for λ = 5. columns: n, gam0cp, gam1cp, gam2cp, gam3cp, gam4cp, gam5cp.
why is nyc reporting surge in virus deaths?
tracking covid-19 excess deaths across countries. the economist
prediction intervals for poisson regression
a primer on stochastic epidemic models: formulation, numerical simulation, and analysis
statistical models based on counting processes
nj says 1,854 additional residents likely died of coronavirus after review of death records
statistical inference. the wadsworth & brooks/cole statistics/probability series
theoretical statistics
coronavirus: what is the true death toll of the pandemic?
european centre for disease prevention and control: an agency of the european union
coronavirus disease (covid-19) pandemic: increased transmission in the eu/eea and the uk - seventh update
statistical methods for groundwater monitoring
why new york's coronavirus death count jumped: the stories of patients who died at home
covid-19 pandemic modeling fraught with uncertainties
sojourning with the homogeneous poisson process
envstats: an r package for environmental statistics
applied life data analysis
median confidence regions in a nonparametric model
r: a language and environment for statistical computing. r foundation for statistical computing
introduction to the theory of nonparametric statistics
linear models in statistics
adventures in stochastic processes
an empirical bayes approach to statistics
an empirical bayes estimation problem
some thoughts on empirical bayes estimation
standardized surveillance case definition and national notification for 2019 novel coronavirus disease (covid-19)
adventures of a mathematician
of cambridge series in statistical and probabilistic mathematics
a.2 tables of simulation results
key: cord-337763-kusqyumn authors: alves, t. h. e.; souza, t. a. d.; silva, s. d. a.; ramos, n. a.; oliveira, s. v. d. title: underreporting of death by covid-19 in brazil's second most populous state date: 2020-05-23 journal: nan doi: 10.1101/2020.05.20.20108415 sha: doc_id: 337763 cord_uid: kusqyumn
the covid-19 pandemic brings to light the reality of the brazilian health system. the underreporting of covid-19 deaths in the state of minas gerais (mg), which has the second largest population in the country, reveals government unpreparedness: testing capacity in the population is low, which prevents a real understanding of the general panorama of sars-cov-2 dissemination. the goals of this research are to analyze the causes of deaths in the different brazilian government databases (arpen and sinan) and to assess whether there is under-registration, as shown by an unexpected increase in the frequency of deaths from causes clinically similar to covid-19. a descriptive and quantitative analysis of the number of deaths from covid-19 and similar causes was made in different databases. our results demonstrate that the different official sources had a discrepancy of 209.23% between their data for the same period. there was also a 648.61% increase in sars deaths in 2020 compared to the average of previous years.
finally, it was shown that there was an increase in the rates of pneumonia and respiratory insufficiency (ri) of 5.36% and 5.72%, respectively. in conclusion, the unexplained excess of deaths from sars, respiratory insufficiency, and pneumonia compared to previous years indicates underreporting of covid-19 deaths in mg.
respiratory syndrome (sars) that emerged in december 2019 in china and became a pandemic in march 2020 due to its high infection and mortality rates [1] [2] [3]. covid-19 was the official name given by the world health organization (who) to the disease caused by the new coronavirus of 2019 (sars-cov-2) [1]. the first epicenter of covid-19 was observed in wuhan, the capital of hubei, china, in december 2019, based on notifications of several pneumonia cases [4]. since then, covid-19 has rapidly spread around the world and, as of may 12th 2020, more than 4.4 million cases of the disease have been confirmed, causing over 299,000 deaths worldwide [5]. of this total, brazil has reported more than 188,000 cases and over 13,000 deaths, according to the coronavírus brasil database [6]. covid-19 is classified according to the severity of its symptoms. patients with the mild form (80% of the cases) present fever, dry cough, chills, malaise, muscle pain, and sore throat. patients with the moderate form present fever, respiratory symptoms, and radiographic characteristics. severe patients (5% of the cases) manifest dyspnea (respiratory rate > 30 breaths per minute), low oxygen saturation (< 93%), and a low pao2/fio2 ratio (< 300 mmhg), and may evolve to respiratory failure, septic shock, and multiple organ failure [7] [8] [9]. furthermore, increased age and the presence of comorbidities, such as hypertension, diabetes, and coronary disease, are associated with mortality in covid-19 patients [10] [11]. the accurate diagnosis of covid-19 is carried out by searching for the genetic material of the virus and, in a complementary way, by imaging methods.
computed tomography and radiographs can identify lesions in the lungs due to viral multiplication [12] [13].
this preprint, which was not certified by peer review, was posted on medrxiv on may 23, 2020 and is made available under a cc-by-nc-nd 4.0 international license; the copyright holder is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. https://doi.org/10.1101/2020.05.20.20108415
laboratory confirmation is essential for the timely management of cases to limit transmission. however, brazil is far below the ideal number of tests for covid-19, as there are not enough laboratory supplies to understand the overall panorama of the virus' spread. furthermore, confirmatory molecular tests depend on the availability of imported reagents, which are globally scarce, and on government investments that prioritize this strategy. this scenario leads brazil to have a delay in the confirmation of covid-19 cases and deaths. these problems are aggravated when the patient dies, because testing is even more difficult in these cases. in addition, the recommendation is to collect blood and sputum to perform the culture, since these samples have a higher viral load considering the studies done to date [14]. the difficulty regarding death registration has also been observed in the state of minas gerais, which, by the end of april 2020, had 584 notifications of suspected deaths, of which 81 (13%) had not yet been confirmed or discarded [15]. thus, it is possible to state that there is a disparity between the real number of covid-19 deaths and the numbers that are reported in different brazilian sources of information, since not all deaths have been tested for confirmation or exclusion and some are potentially being attributed to causes other than covid-19.
the present study aims to analyze the causes of death in the notary records and in the brazilian national disease notification system records, and thus evaluate under-registration and the possible increase in the frequency of deaths from causes clinically compatible with covid-19 in the minas gerais territory. [17]. for this study, the notary office records were analyzed from january to april of 2020 in the state of mg. additionally, to assess the excess deaths in this period according to their causes, information from the sinan was accessed for the years 2017 to 2019. the notary data were obtained from the civil registry transparency portal, which is a free access platform developed to provide information about births, marriages, and deaths. due to the covid-19 pandemic, these data are being grouped in the special sections covid-19 and the covid registral panel made available on the arpen database [18]. the information presented here (accessed on 05/05/2020) is based on death certificates (dd), presenting only one cause for each death certificate [19]. to evaluate under-registration in the different information systems in brazil, sinan data were collected through the infogripe platform of the oswaldo cruz foundation (fiocruz). the infogripe database [20] is an initiative that aims to monitor and present alert levels for reported cases of severe acute respiratory syndrome (sars) in sinan [20]. the data in this system were compared with the notary data. on this platform, the records of sars and covid-19 were selected on 05/07/2020 according to .
we also evaluated the excess deaths from causes that present clinical compatibility with covid-19, according to the following etiology: severe acute respiratory syndrome (sars), pneumonia, respiratory insufficiency (ri), sepsis (sepsis/septic shock), indeterminate causes (deaths related to respiratory diseases, but not conclusive), and other deaths (all other types of deaths that are not listed above) [19]. the data were collected and analyzed in a spreadsheet using descriptive statistics and presented as raw numbers, relative frequencies, and central tendency measures. to assess the excess deaths per epidemiological week, the average, minimum, and maximum values of deaths from the years 2017 to 2019 were calculated and compared with the 2020 values for the diseases that presented changes in the pattern of distribution in the first four months of 2020. all graphs were prepared using graphpad prism 7 software (graphpad software, inc., san diego, ca). a total of 201 covid-19 deaths were identified in the notary records, and this number differs by 209.23% from the number of deaths registered in the sinan for the same period (table 1). the rates of pneumonia and respiratory insufficiency from january to march rose by around 5.36% and 5.72%, respectively, compared to the same period in 2019.
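the comparison described in the methods (the 2020 count for each epidemiological week set against the average, minimum, and maximum of 2017 to 2019) can be sketched as follows. this is a minimal illustration, not the authors' spreadsheet: the function name `weekly_excess`, the field names, and the toy counts are hypothetical and are not data from the study.

```python
from statistics import mean


def weekly_excess(baseline_by_year, current):
    """Compare weekly counts with the mean/min/max of baseline years.

    baseline_by_year: dict mapping year -> list of weekly death counts
    current: list of weekly death counts for the year under study
    """
    rows = []
    for week, observed in enumerate(current):
        hist = [counts[week] for counts in baseline_by_year.values()]
        avg = mean(hist)
        rows.append({
            "week": week + 1,
            "observed": observed,
            "mean": avg,
            "min": min(hist),
            "max": max(hist),
            "excess": observed - avg,                  # deaths above average
            "pct_change": 100.0 * (observed - avg) / avg,
            "above_max": observed > max(hist),         # outside historical range
        })
    return rows


# Hypothetical toy counts for two epidemiological weeks.
baseline = {2017: [10, 20], 2018: [20, 20], 2019: [30, 20]}
rows = weekly_excess(baseline, current=[40, 20])
```

flagging weeks in which the 2020 count exceeds the historical maximum, and not only the historical average, is one simple way to separate an unusual excess from ordinary year-to-year variation.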
the study points out divergences of information between different death registration systems. the important increase in sars deaths that started earlier than those from covid-19, in epidemiological week 10, is also highlighted, suggesting the underreporting of covid-19 deaths in the state of minas gerais. the covid-19 situation is particularly challenging because, besides being a new and unprecedented disease, it is also capable of triggering other conditions, such as pneumonia and sars, which can be recorded as the main cause of death. in other words, covid-19 may be the underlying cause, that is, it may not be the direct cause of death that has been registered. in this perspective, there is a subjectivity bias, since the physician can attest or not the death from covid-19 according to his clinical knowledge without the need of laboratory tests [21]. this finding is consistent with data from hubei, china and northern italy, where mortality calculations were adjusted for the biases of preferential verification, symptomatic and severe cases, and delay in death records. an increase in the mortality rate was found, which confirms the existence of underreporting of covid-19 deaths in those regions [22]. in relation to underreporting in brazil, the ministry of health (mh) reports that the number of under-reported deaths is low according to the mortality information system (mis), because states and municipalities are advised to include deaths from covid-19, whether confirmed or only suspected, in the system as a priority, in order to advance the analysis of these cases [23].
however, our results show that there is a significant underreporting of covid-19 deaths, given the excess of sars deaths. another issue that should be analyzed is that although the civil registry information center takes into consideration both confirmed and suspected deaths, the mh discloses in its reports only the laboratory-proven covid-19 deaths [23]. however, suspected deaths need to be considered in the count, provided it is noted that they have not been confirmed. this is because it is known that many of these deaths will not be able to be analyzed, given the difficulties in collecting, transporting, and packaging the post-mortem samples. thus, if they are not mentioned, the real situation in brazil and, consequently, in the state of mg will be understated. the brazilian mh also points out that in the same death certificate more than one cause of death can be described, so that the record of covid-19 can be associated with other diseases. however, the civil registry transparency portal presents these causes separately, even those included or registered in the same death certificate. thus, one cannot simply add up the deaths reported on the portal for the different diseases, because this would generate false over-notification. a thorough investigation must be made when considering each death and the causes that were cited in the death certificate [23]. however, according to the hierarchical criteria described in the civil registry transparency portal, only one cause of death is selected for the count, and not all the causes present in the same death certificate [24], which validates the data shown on this platform and the information presented here.
it is worth noting that the different government death registration systems, such as those of the municipalities and states, are not fully connected and that several of them depend on manual labor for registration. this can cause discrepancies and delays in data traffic and, consequently, in the production of timely and reliable information. the who has been advising countries on the need to expand laboratory testing capacity as a strategy to overcome the pandemic [25]. this action will enable real knowledge of a population's immunity, providing reliable statistics for a better understanding of the circulation of the disease. consequently, strategies to control the pandemic and even the relaxation of non-pharmacological measures, such as social isolation and quarantines, may be proposed. in brazil, a network formed by reference laboratories was established to help fight covid-19 [14]. however, the country is far below the optimal number of tests for covid-19, as there are not enough tests to have a reliable panorama of the real number of cases and deaths. this scenario leads brazil to have a delay in accounting for the records of covid-19. in conclusion, our results reveal that covid-19 deaths in the state of minas gerais are higher than the official statistics presented. in view of these aspects, it is necessary to expand brazil's diagnostic capacity, which will allow us to recognize the real number of covid-19 deaths and cases in minas gerais. thanks to gabriela geraldo mendes and adélio tiago da mota for the collaborations.
thanks to the department of collective health of the faculty of medicine of the federal university of uberlândia for the encouragement.
who director-general's opening remarks at the media briefing on covid-19
clinical characteristics of coronavirus disease 2019 in china
examining the effect of social distancing on the compound growth rate of sars-cov-2 at the county level (united states) using statistical analyses and a random forest machine learning model. public health
the covid-19 epidemic. tropical medicine & international health
covid-19 coronavirus pandemic
painel de casos de doença pelo coronavírus 2019 (covid-19) no brasil pelo ministério da saúde
unique epidemiological and clinical features of the emerging 2019 novel coronavirus pneumonia (covid-19) implicate special control measures
clinical, laboratory and imaging features of covid-19: a systematic review and meta-analysis. travel medicine and infectious disease
sobre a doença
clinical features of patients infected with 2019 novel coronavirus in wuhan, china. the lancet
clinical course and risk factors for mortality of adult inpatients with covid-19 in wuhan, china: a retrospective cohort study. the lancet
recomendações de uso de métodos de imagem para pacientes suspeitos de infecção pelo covid-19
essentials for radiologists on covid-19: an update radiology scientific expert panel
diretrizes para diagnóstico e tratamento da covid-19
informe epidemiológico coronavírus
portal da transparência: registro civil do brasil em 2020
associação dos registradores de pessoas naturais (arpen) do brasil database
monitoramento de casos reportados de síndrome respiratória aguda grave (srag) hospitalizados
data (owd) database statistics and research coronavirus pandemic
estimation of sars-cov-2 mortality during the early stages of an epidemic: a modelling study in hubei
boletim epidemiológico especial 14
key: cord-016536-8wfyaxcb authors: ubokudom, sunday e. title: physical, social and cultural, and global influences date: 2012-02-20 journal: united states health care policymaking doi: 10.1007/978-1-4614-3169-5_6 sha: doc_id: 16536 cord_uid: 8wfyaxcb
in chap. 5, we examined the technological environment of the health care policy-making system. specifically, we examined the classification, evolution, and diffusion of medical technology; the effects of medical technology on medical training and the practice of medicine; effects on medical costs, quality of care, and quality of life; effects on access to care; the ethical concerns raised by medical technology; and the practice of technology assessment.
we concluded the chapter by observing that the growth of technology, as well as other human endeavors, affects other important aspects of our lives, most notably, the air we breathe, the food we eat, the generation of radioactive by-products and toxic chemicals, the manufacture of illicit drugs, and the generation of natural and man-made hazards. in other words, in addition to their effects on the health care system, technology and other human activities affect many other aspects of our lives that are associated with health. the who's definition of health as "a complete state of physical, mental, and social well-being, and not merely the absence of disease or infirmity" (who 1948), is primarily based on the wellness model. in this definition, emphasis is put on the fact that health is not merely the absence of disease, but also involves a social dimension. therefore, it also emphasizes the social and financial support systems identified in table 5.1 of chap. 5. this definition of health, as involving the combination of physical, mental, and social well-being, led to the concept of the "health triangle." the health triangle left out the spiritual dimension of health, which has recently gained significant attention in the literature due to a growing interest in the notion of holistic health. holistic health stresses the importance of all the things that make a person whole and complete. in addition to the three dimensions of the health triangle, of his analysis (szreter 2002, p. 723). subsequent studies revealed that the cessation of the large-scale redistribution of income and wealth from the very rich to the poorest in society had adverse effects on the health of the population. for example, when unhealthy behaviors and lifestyles were held as constant as possible, studies showed that people of lower socioeconomic status were more likely to die prematurely than were people of higher socioeconomic status (isaacs and schroeder 2004, p. 1138; smith et al.
1996, p. 486; davey smith et al. 1994, p. 131). the relationship between physical, social and cultural, and global environmental factors and health status is very well documented. in a letter to the editor of the jama, winkelstein (1993, p. 2504) argues that curative medical care, or those practices that are used for the care and rehabilitation of the sick, which involve most of the physical and designed social technologies listed in table 5.1 of the previous chapter, is not the same as health care. medical care, as he defines it, makes only modest contributions to the health status of the population. on the contrary, the health status of the population is largely determined by a different set of factors that involve important physical, social, and economic components. these include preventive medicine, genetic predisposition, social and economic circumstances, environmental conditions, lifestyles and behaviors, and medical care (mckeown 1976; kannel et al. 1991; belloc and breslow 1972, p. 409; bunker et al. 1989; bunker et al. 1995, p. 305; marmot et al. 1991, p. 1387; bell and standish 2005, p. 339; mcginnis et al. 2002, p. 78; wilkinson 1996, p. 1504). we briefly examine each of the identified determinants of health below. preventive medicine seeks to minimize the occurrence of illness and disease. unlike the medical model that is reactive and seeks to contain disease and ill-health after they have occurred, preventive medicine is proactive and seeks to minimize the likelihood of the occurrence of disease and ill-health. generally, there are three areas or types of preventive measures, namely: primary prevention, secondary prevention, and tertiary prevention. primary prevention seeks to stop or minimize the development of disease or ill-health before it occurs.
primary prevention may involve counseling against smoking, in order to prevent the development of chronic emphysema or chronic obstructive pulmonary disease (copd) and lung cancer. other primary interventions may include the promotion of an active lifestyle or exercise program, in order to minimize the likelihood of excess body fat and heart disease; driver education and mandatory seatbelt and motorcycle helmet laws, in order to reduce motor vehicle accidents and accidental head injuries; vaccinations against various diseases and illnesses, such as measles and rubella, which can minimize the occurrence of early childhood diseases and mortality; and water purification and sewage treatment programs that can minimize the occurrence of typhoid, cholera, and other waterborne diseases. secondary prevention involves the early detection and treatment of disease. health screenings and periodic and regular health examinations, such as hypertension screenings, mammograms, and pap smears, serve as examples of secondary prevention measures. these examples fall under the broad category of health promotion discussed in chap. 4. the beneficiaries of these programs are currently healthy people who are targeted to improve their health-related behaviors in order to minimize their chances of developing catastrophic and expensive illnesses. as was discussed in chap. 4, secondary prevention measures are some of the most cost-effective steps employers take to lower their health benefit costs (coffield et al. 2001, p. 1). tertiary prevention measures involve steps taken to reduce the complications of diseases or illnesses, or to prevent further illnesses. they involve rehabilitative practices and the monitoring of the process of health care delivery. the infection control practices in hospitals and other improvements in the methods of health care delivery discussed in chap.
2, under the postindustrial period of the evolution of the health care system, which are intended to reduce the occurrences of nosocomial infections and iatrogenic illnesses, are practical examples of tertiary prevention measures. other examples include patient education, nutrition counseling, and behavior modification programs that seek to prevent the recurrence of disease and illness (timmreck 1994, p. 17). since the mid-1970s in the united states, there have been significant reductions in heart disease, stroke, personal injury, and non-tobacco-related death rates (mcginnis and foege 1993, p. 2207; banta and jonas 1995, p. 20). similarly, the data presented in table 5.3 of chap. 5 show significant declines in death rates related to heart disease, cancer, stroke, influenza and pneumonia, chronic liver disease or cirrhosis, human immunodeficiency virus (hiv) disease, suicide, and homicides, from 1990 to 2006. these particular declines appear to be the result of preventive health measures, such as early screening, detection and treatment of hypertension, the provision and utilization of pneumonia and influenza vaccinations, moderate alcohol intake or abstinence, safe sex practices, suicide prevention and anger management programs, increased use of seatbelts and reductions in driving-under-the-influence episodes, smoking cessation, and the lowering of dietary fat and cholesterol. if at least some of the declines in mortality discussed above are due to preventive measures, the preventive strategy has yielded significant gains in health. most likely, it was the recognition of the crucial role that preventive medicine plays in enhancing population health that led the us public health service to convene the us preventive services task force (uspstf) in 1984.
the task force is a leading independent panel of nationally recognized nonfederal experts in prevention and evidence-based medicine. programmatic responsibility for the task force was transferred to the agency for healthcare research and quality (ahrq) in 1995 (uspstf procedure manual 2011). the uspstf is assigned the responsibility of making evidence-based recommendations that address primary and secondary preventive services targeting conditions that represent a substantial burden in the country, and that are provided in primary care delivery settings or made available through primary care referrals. the task force's recommendations are intended to improve clinical practice and promote the public health. tertiary prevention measures are outside the scope of the uspstf. even though the main audience for task force recommendations is the primary care provider, the recommendations are also used to guide programmatic, funding, and reimbursement decisions by policy-makers, managed care organizations, public and private payers, quality improvement organizations, research institutions, and consumers. beginning at the end of may 2007, the uspstf changed the grades it assigns to its recommendations. it assigns one of five possible letter grades, a, b, c, d, or i, to each of its recommendations, including "suggestions for practice" associated with each grade. the task force also defines the levels of certainty regarding the net benefit of each of its recommendations. the task force's 2009 reduction of the grade given for evidence quality from "b" to "c" for routine mammograms in women under the age of 50 years generated significant controversy among health professionals and politicians (kinsman 2009). in addition to the 2009 mammography recommendations stated above, the uspstf has recently recommended against screening for testicular cancer in adolescent or adult males (grade d recommendation) (uspstf 2011, p. 483).
it has also concluded that there was insufficient evidence to assess the balance of benefits and harms of screening for bladder cancer in asymptomatic adults (moyer 2011, p. 246), and that prostate-specific antigen (psa) screening was associated with psychological harms, while its potential benefits remained uncertain (lin et al. 2008, p. 192). table 6.1 shows the approach adopted by the agency in june 2007, to rank its recommendations. health is dependent upon biological factors. our predispositions to health or disease begin to take shape at the moment of conception. these predispositions are embedded in our genetic code. the genetic code guides the development of the proteins that determine our phenotypes (sizes, shapes, personalities, hair color, etc.) and genotypes, or those aspects of our genetic codes that we cannot see, such as the biologic limit of our life expectancies (mcginnis et al. 2002, p. 80; khoury et al. 1993; bell and standish 2005, p. 339; starfield 1973, p. 132; blum 1981; centers for disease control and prevention (cdc) 1979). genetic factors predispose individuals to certain diseases. but although an individual may have a strong likelihood of developing a particular disease, this propensity to develop the disease is significantly enhanced by environmental factors. for example, some studies demonstrate that there is a genetic basis for alcoholism (reich 1988). but a person who has never taken a drink will not become an alcoholic. some triggers, in this case, the availability and consumption of alcohol, are necessary for the individual to progress from being genetically predisposed to alcoholism to actually becoming an alcoholic (berkman and breslow 1983; burnett 1971; banta and jonas 1995, p. 18; davis and webster 2002, p. 13). these examples suggest that the interaction between genetic factors and the environment in producing a particular disease is complex.
while people have little or no control over their genetic makeups, the lifestyles and behaviors they freely choose and the surroundings where they live can have significant influences on the likelihood of developing a particular disease to which they are genetically predisposed. to further the discussion of the influence of genetics on health, mcginnis et al. (2002, p. 80) cite studies which show that although only about 2% of deaths in the united states may be attributed to purely genetic diseases, about 60% of late-onset disorders, such as diabetes, cardiovascular disease, and cancer, have some genetic component. for example, the brca1 gene accounts for only between 5 and 10% of breast cancers in the united states; only about 10% of colon cancers may be explained by genes, and only about 5% of elevated serum cholesterol levels may be explained by familial hyperlipidemia. similarly, studies of identical twins focusing on the occurrence of schizophrenia, and other twin studies examining the occurrence of dementia in older people, have found that about half of each might be explained by genetic factors. further, while about two-thirds of the risk of obesity might be genetic, the risk is expressed only with exposure to controllable lifestyle factors (baird 1994, p. 133; muller 2000, p. 7; panjukanta et al. 1998, p. 369; kendler 1983, p. 1413; rowe and kahn 1998). the institute of medicine (iom) (2003, p. 20) reported that americans in 2003, compared with those who lived in 1900, were healthier, lived longer, and enjoyed lives that were less likely to be marked by injuries, ill health, or premature death. but the gains in health reported by the iom were not shared fairly or equally among the population of the united states in 2003, and they are still not shared equally by all americans.
americans with a good education, those who hold high-paying jobs, and those who live in serene and comfortable neighborhoods live longer and healthier lives than those with lower levels of education and income, and those who live in crime-infested, overcrowded, and less comfortable and cohesive urban areas (isaacs and schroeder 2004, p. 1137; bell and standish 2005, p. 339; lantz et al. 1998, p. 1703; navarro 2009, p. 423; satcher 2010, p. 6; williams 1990, p. 81; metzler 2007, p. 1; kilbourne et al. 2006, p. 2113; berkman and lochner 2002, p. 291). there are several pathways through which social and economic circumstances affect health. those with good educational achievements are more likely to attain higher socioeconomic status than the poorly educated (angel et al. 2006; barr 2008; bartley 1994; mirowsky and ross 1986, p. 23). people of lower socioeconomic status die earlier and are more susceptible to undesirable life events than people at higher socioeconomic levels, a pattern that holds true in a progressive fashion from the poorest to the richest (mcleod and kessler 1990, p. 162; adler et al. 1993, p. 3140; adler and newman 2002, p. 60; guralnik et al. 1993, p. 110; mcdonough et al. 1997, p. 1476). this trend also holds whether one looks at education or occupation (national center for health statistics 1998, p. 3; kaplan and keil 1993, p. 1973). these differences are said to be due to the fact that people of higher socioeconomic status have healthier behaviors and lifestyles than those of lower socioeconomic status. people of higher socioeconomic status are less likely to smoke, and are far more likely to eat healthier foods and to engage in leisure-time physical exercise (national center for health statistics 2002, p. 198; pratt et al. 1999, p. s526; giles-corti and donovan 2002, p. 601). according to isaacs and schroeder (2004, p.
1138), as a result of "a sedentary lifestyle and unhealthy eating habits, obesity and the diseases it fosters now characterize lower-class life." poor eating habits and a sedentary lifestyle alone do not explain the differences in health between people of high and low socioeconomic status. rather, another explanation for the differentials lies in the distribution of income or the income gradient between the low and high socioeconomic groups. in a study of white americans using 1980 census data, undertaken by smith et al. (1996, p. 486), men earning less than $10,000 per year were 1.5 times as likely to die prematurely as were those earning $34,000 or more. a similar study of british civil servants conducted about 2 years before the american study showed that when smoking and other risk factors were controlled for, those who were in the lowest employment category were more than twice as likely to die prematurely of cardiovascular disease as were those in the highest employment category (davey smith et al. 1994, p. 131). the findings of these studies have led to the theory that inequitable distribution of income and wealth, or the so-called income and wealth gradient, causes poor health (sen 1992, p. 102, 1999; daniels et al. 2000; deaton 2002, p. 13). as noted above, the relationship between health and income is referred to as a gradient. this terminology emphasizes the gradual relationship between the two variables. health improvements are directly related to improvements in income throughout the income distribution, and poverty has more than a "threshold" effect on health (deaton 2002, p. 14). the us national longitudinal mortality study (nlms) published by the national institutes of health (nih) (1992) showed that the proportional relationship between income and mortality was the same at all income levels, implying that the absolute reduction in mortality for each dollar of income was much larger at the bottom of the income distribution than at the top.
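the step from a constant proportional relationship to a larger absolute effect at low incomes can be sketched with a simple worked equation. this is only an illustration under an assumed constant-elasticity form; the symbols m (mortality rate), y (income), A, and β are introduced here for exposition and are not taken from the nlms.

```latex
% assumed functional form (illustrative, not the nlms model):
% mortality m falls with income y at a constant elasticity \beta > 0
m(y) = A\,y^{-\beta}
\quad\Longrightarrow\quad
\frac{d\ln m}{d\ln y} = -\beta
\quad\text{and}\quad
\frac{dm}{dy} = -\beta\,\frac{m(y)}{y}.

% ratio of the absolute per-dollar effects at two incomes y_1 < y_2
\frac{\left|dm/dy\right|_{y_1}}{\left|dm/dy\right|_{y_2}}
  = \left(\frac{y_2}{y_1}\right)^{\beta+1} > 1 .
```

under this assumption the percentage change in mortality per percentage change in income is the same everywhere, but the per-dollar change is proportional to m(y)/y, which is largest where mortality is high and income is low. with a purely hypothetical β = 0.5, an extra dollar at an income of $10,000 would reduce mortality by about 10^{1.5} ≈ 32 times as much as an extra dollar at $100,000.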
apart from income, mortality is also known to decline with wealth, rank, and with social status (marmot et al. 1984, p. 1003, 1991, p. 1387). similarly, studies also show marked differences in life expectancy by race and by geography or people's places of residence. for example, there is a 20-year gap in life expectancy between white men who live in the healthiest counties or localities and black men who live in the unhealthiest counties (murray et al. 1998, p. 1; gittelsohn 1982, p. 133; marmot 2006, p. 1304; kawachi and berkman 2003). the brief discussion in this section points to the effects of numerous, and possibly interrelated, social and economic factors on health. income might affect health just as health might affect income; the distribution of income and wealth might affect health. similarly, education, race, minority status, geography, employment, housing, discrimination and social isolation, nutrition, lifestyle, stress, health practices, and coping skills might affect health. it does not appear to matter very much which of the above factors is stressed, especially since they are more likely to be interdependent than independent. disease risks exist, most often, along a continuum (rose 1994). risks are rarely dichotomous. according to lochner (2002, p. 2291), there is no clear division between risk and no risk with regard to, for example, levels of blood pressure, cholesterol, alcohol or tobacco use, physical activity, diet and weight, etc. this gradient of risk also exists for many social and environmental conditions, such as socioeconomic status, social isolation, occupational and environmental exposure, and air quality.
put differently, the numerous studies on the determinants of health, which we cannot summarize individually here for lack of space, show that the human and material resources at our disposal, the foods we eat, our levels of education, the houses we live in, and the quality of the environments where we live and work, to name but a few, affect every person's health, but that the effects may vary in direction and scope from person to person, depending on each person's unique circumstances. improvement in environmental conditions is an important goal of the us government, as can be inferred from the emphasis on environmental quality outlined in healthy people 2010. that document clearly states that factors in the physical and social environment play major roles in the health of individuals and communities. the physical environment is operationalized to include the air, water, and soil through which exposure to chemical, biological, and physical agents may occur. the physical environment can harm individual and community health, especially when individuals and communities are exposed to toxic substances, irritants, infectious agents, and physical hazards in homes, schools, and work sites. the physical environment can also promote good health, for example, by providing clean and safe places for people to work, exercise, and play (healthy people 2010, p. 19). therefore, the physical environment is perhaps one of the most important factors that should be considered when classifying the health status of an individual (wikipedia 2010). environmental factors, such as air and water quality, exposure to pesticides and toxic waste, and housing conditions, have major effects on health and human development. for example, substandard air and water quality have been directly associated with diseases such as cancer, asthma, certain birth defects, and some neurological disorders (grant makers in health 2010, p. 165).
similarly, many forms of cancer are associated with dioxin, polychlorinated biphenyls (pcbs), and mercury (friis 2007). also, airborne particulate matter, tobacco smoke, and ground-level ozone have been known to cause asthma attacks in children. exposure to lead, which can be found in peeling paint or in the soil and air in many poor communities, has been associated with impaired cognitive and behavioral development and low birth weight among children born to exposed mothers, and is also known to cause kidney damage (friis 2007). in recognition of the danger of environmental contamination, bell and standish (2005, p. 339) urge communities to act on their own behalf to make changes in the policies that affect their physical, social, and economic environments. they state, plausibly, that "policy, place, and community" matter. combined, policy and community can alter or ameliorate the underlying forces that lie at the heart of the determinants of health. for example, they argue that policy determines the behaviors or things that are allowed, encouraged, discouraged, and prohibited. policy also determines whether industrial facilities will be sited near residential neighborhoods; how industrial facilities treat their neighbors; how dense neighborhoods will be; what materials can be used to build houses; who will live in a neighborhood; whether businesses can locate in a neighborhood; and whether there are tax or other incentives available for locating in a neighborhood (bell and standish 2005, p. 340). in the developed communities or countries, environmental epidemiologists are concerned about such things as gene-environment interactions, environment-environment interactions, particulate air pollution, nitrogen dioxide, ground-level ozone, environmental tobacco smoke, radiation, lead, video display terminals, cellular telephones, and persistent organic pollutants (pops) that act as endocrine disruptors.
exposure to these downstream or proximate environmental vectors (exposures that are closely related in time and space to the ill-effects they cause) affects both health and well-being (encyclopedia of public health 2010). in the developing communities, the primary environmental determinants of health are said to involve biological agents in the air, water, and soil that account for most deaths. for example, diarrheal diseases acquired from contaminated food or water, malaria, intestinal parasitic infections, and respiratory diseases caused by biological and chemical agents in both indoor and outdoor air wreak havoc in the developing countries. these environmental hazards take a far greater toll on human life and suffering in absolute terms compared to those environmental vectors of concern in the developed countries (encyclopedia of public health 2010). the above environmental vectors that cause havoc in the developing countries also abound in the poor localities of the united states and other developed countries. wealthy people are more likely to live in better homes and locations where they are less exposed to environmental risks than poor people (friis 2007; mcleod and kessler 1990, p. 162; giles-corti and donovan 2002, p. 601; shi and singh 2008, p. 51; grant makers in health 2010, p. 165). for example, although the rates of asthma have been rising in the country, the disease affects low-income people disproportionately. whereas the national prevalence rate of childhood and adult asthma is put at about 7%, some african-american communities report about 25% of children suffering from asthma. also, puerto rican children are reported to have the highest prevalence of active asthma of any us ethnic or racial group. in california, latino children are reported to be hospitalized for asthma at a rate that is 10% greater than that of white children.
obviously, environmental hazards are some of the reasons for these disparities (healthy people 2010; joint center for political and economic studies and policylink 2004, p. 6; flores et al. 2002, p. 82). despite the gains in environmental quality since the advent of the environmental movement in the 1970s, mainstream environmental policies neglected the problems identified in low-income communities because the inhabitants of those areas lacked the political and economic resources to press for environmental justice. however, since its start around 1982, the environmental justice movement has resulted in the cleanup of hazardous waste sites, the redevelopment of brownfields, the shutdown of incinerators, and the establishment of parks and conservation areas in low-income communities. additionally, in low-income communities, local pollution problems are being addressed, cleaner and more accessible means of public transportation are made available, and wild lands and unique habitats are being protected (faber and mccarthy 2001). these changes are due to interest group pressure, the recognition of the externalities associated with environmental degradation, and the value of a clean environment to the health and well-being of all persons, rich and poor. mcginnis et al. (2002, p. 82) contend that behavior choices constitute the single most important domain of influence over health prospects in the united states. lifestyle and behaviors involve many dimensions, including dietary choices, engagement in physical activity, sexual behavior and recreation, including the choice to smoke and to ingest alcohol, the wearing of motor vehicle seatbelts and motorcycle helmets, and other responsible behavior when operating motor vehicles. because lifestyle and behavioral factors are under the control of individuals, the public is very likely to define lifestyle and behavioral health problems as being self-induced.
the choices we make with regard to the many dimensions of lifestyle and behavior enumerated above have significant impacts on personal and population health. for example, dietary factors have been associated with coronary heart disease and stroke; colon, breast, and prostate cancers; and diabetes (us department of health and human services 1988). similarly, a sedentary lifestyle has been associated with increased risk for heart disease, osteoporosis, dementia, diabetes, and colon cancer (us department of health and human services 1996). furthermore, research shows that diets rich in fruits and vegetables, low-fat dairy foods with reduced saturated and total fat, and low sodium diets can lower blood pressure (appel et al. 1997, p. 1117; svetkey et al. 1994, p. 285; sacks et al. 2001, p. 3). the primary difference between how we perceive behavioral change now and much earlier perceptions is the greater awareness that individual behavior occurs in a social context (berkman and lochner 2002, p. 292), be it the place of work or abode, the family, the place of worship, the peer group, the school system, the stage of development, etc. for example, the results from the 2001 national youth risk behavior survey (yrbs) demonstrated that numerous high school students engaged in behaviors that increased their chances of dying from motor vehicle crashes, other unintentional injuries, homicide, and suicide. specifically, the survey results showed that 14.1% of those surveyed had rarely or never worn a seatbelt during the 30 days preceding the survey; 30.7% had ridden with a driver who had been drinking alcohol; 17.4% had carried a weapon during the 30 days preceding the survey; 47.1% had drunk alcohol during the 30 days preceding the survey; 23.9% had used marijuana during the 30 days preceding the survey; and 8.8% had attempted suicide during the 12 months preceding the survey (grunbaum et al. 2002, p. 313).
the authors of the yrbs concluded that "priority health-risk behaviors, which contribute to the leading causes of mortality and morbidity among youths and adults, are often established during youth, extend into adulthood, are interrelated, and are preventable." the examination of the main causes of death in the united states, which we shall discuss in the next section of this chapter, will shed further light on behavioral risk factors. meanwhile, suffice it to say that lifestyle and behavioral factors constitute some of the important determinants of health that health policy must seek to address. even though it is agreed that the contribution of medical care to improved health is not as pronounced as the other factors just examined, curative medical care (those practices, technologies, and organizations that society and the medical profession use to cure and rehabilitate the sick) is nonetheless a key determinant of health (blum 1981; cdc 1979). the centers for disease control and prevention (cdc) estimates that only about 10% of premature deaths in the united states can be attributed to inadequate access to medical care, while the remaining 90% can be accounted for by individual lifestyle and behaviors (50%), genetic profiles (20%), and social and environmental conditions (20%) (cdc 1979). medical care is the least important determinant of health because it is reactive, not proactive: it waits for disease and illness to occur before intervening, so to speak. in other words, while individual and population health are to some extent associated with having access to curative care, access to preventive services is of greater significance. therefore, health can improve significantly, and the prevalence of disease can decline dramatically, without effective medical care, due to the other determinants of health (sigerist 1970, p. 46; mckeown 1976, p. 93; banta and jonas 1995, p. 19).
this knowledge is very likely the reason why williams and jackson (2005, p. 325) and isaacs and schroeder (2004, p. 1141) advocate the broadening of the concept of health policy to include the other determinants of health that were not usually seriously considered when discussing health policy. this knowledge, too, is the primary reason for this chapter of the book. we can elaborate further on the importance and relevance of the determinants of health by linking them to the ten leading causes of death in the united states. where possible, the analysis will link the incidences of mortality reported in the country to each, some, or combinations of the determinants of health. table 6.2 shows the ten leading causes of death in the united states for 2006 and 2007. we present these causes in order to attempt to link some of them to treatable or preventable behaviors and exposures. in other words, we shall attempt to show that most of the deaths can be associated with factors that mainly fall under the social, economic, environmental, and lifestyle and behavioral determinants of health that we have just discussed. most of the ten leading causes of death presented in table 6.2 are nongenetic and can be prevented or treated. diseases of the heart, cancers, cerebrovascular diseases or strokes, chronic lower respiratory diseases, unintentional injuries, diabetes, influenza and pneumonia, and infection- and high blood pressure-induced nephritis can be curtailed, prevented, or treated.
for example, cigarette smoking is linked with an increased risk of heart disease, chronic lower respiratory disease, and cancer; obesity is a major health risk for diabetes, hypertension, coronary heart disease, and some forms of cancer; alcohol causes a wide variety of accidents and injuries, and increases the risks for high blood pressure, irregularities of the heart, and stroke; flu vaccines can minimize influenza deaths; and seeking treatment for infections can prevent septicemia. additionally, although there is a genetic basis for nephrosis and nephrotic syndrome, the conditions can occur as a result of infection (such as strep throat, hepatitis, or mononucleosis), use of certain drugs, and diabetes. furthermore, although age and family history are important risk factors for alzheimer's disease, longstanding high blood pressure and a history of head trauma are suspected risk factors for the disease as well. mcginnis and foege (1993, p. 2207) identified and quantified the major external or nongenetic factors that contributed to deaths in the united states in 1990. deaths associated with socioeconomic factors and access to medical care, although important contributors to the total deaths recorded in the country, were not included in the study because of the difficulty of quantifying them independent of the other factors reported in the study. about 10 years after the mcginnis and foege study, mokdad et al. (2004, p. 1238) used a similar methodology to quantify the nongenetic factors that contributed to deaths in 2000. the results of the two studies cited above showed that about half of all deaths that occurred in the united states in both 1990 and 2000 could be attributed to a small number of largely controllable behaviors and exposures, including tobacco, diet and activity patterns, alcohol, microbial and toxic agents, firearms, sexual behavior, motor vehicle accidents, and illicit drug use.
the results of the causes of death studies reported by mcginnis and foege and mokdad and his colleagues are consistent with the findings of the 2001 national yrbs cited earlier in this chapter. the survey results showed that in the united states, 70.6% of all deaths among youth and young adults aged 10-24 years were due to only four causes: motor vehicle crashes, other unintentional injuries, homicide, and suicide. the deaths attributable to these causes among the identified population group were 31.4, 12, 15.3, and 11.9%, respectively (grunbaum et al. 2002, p. 313). furthermore, substantial morbidity and social problems were said to result from the approximately 870,000 pregnancies that occurred each year among women 15-19 years (ventura et al. 2001, p. 1), and from the estimated million cases of sexually transmitted diseases (stds) that occurred each year among persons 10-19 years (institute of medicine 1997; eng and butler 1997). similar to the studies on the actual causes of death in the united states in 1990 and 2000, the yrbs also found that the leading causes of mortality and morbidity among all age groups in the country were related to behaviors that contributed to unintentional injuries and violence, tobacco use, alcohol and other drug use, sexual behaviors that contributed to unintended pregnancies and stds, including hiv infection, unhealthy dietary behaviors, and sedentary lifestyles. in 2010, almost 10 years after the 2001 yrbs discussed above, the cdc quantified the death rates among teenagers aged 12-19 years between 1999 and 2006. not surprisingly, the five leading causes of death for the teenage population remained constant throughout the period. they were as follows: accidents or unintentional injuries, 48% of deaths; homicides, 13% of teenage deaths; suicide, 11%; cancer, 6%; and heart disease, 3%.
further analysis showed that motor vehicle accidents accounted for almost three quarters (73%) of all deaths from unintentional injury; and that non-hispanic black males had the highest death rate among all teenagers, with homicide being the leading cause of death for them (minino 2010). the determinants of health that have occupied our attention up to this point are not only affected by the broad national and personal factors we have identified but are also affected by broad global or international factors (shi and singh 2008, p. 53). therefore, the rest of this chapter is devoted to examining the influences of global factors on the health care system and the health policymaking process. foreign policies involve the political relationships between countries and the outside world. foreign policy development generally concerns the protection of a country's national interests, usually defined in terms of security, economic prosperity, and ideological goals (lee et al. 2007, p. 208). increased globalization has led to the broadening of foreign policy concerns to include health. conversely, it is now recognized that international trade and finance, migration and population mobility, environmental change or global warming, the emerging and reemerging infectious disease paradigms, natural disasters, and global insecurity or terrorism have clear and observable consequences for human health (kassalow 2001; mcinnes and lee 2006, p. 5; lee et al. 2007, p. 208; katz and singer 2007, p. 223; campbell-lendrum et al. 2007, p. 235; fidler 2007, p. 243; macpherson et al. 2007, p. 200; labonte et al. 2009). we shall briefly examine how these components of globalization (international trade, population mobility, infectious diseases, global warming or climate change, and natural disasters and terrorism) affect countries' health care and policymaking systems generally, and the united states' health care and policymaking systems in particular.
we begin with international trade. the principal agents of global international trade and finance include such international agencies as the world bank, the international monetary fund (imf), and the world trade organization (wto). it has been reported that the market-biased or efficiency-oriented austerity policies these organizations promote or sponsor have resulted in reduced expenditures for social programs in developing countries, thereby impairing population health and slowing the advances in literacy, fertility reduction, and improved reproductive health of the women of the developing countries (kinnon 1998, p. 397; gray 1998; watts 1997). some specific examples of international trade and finance policies include the following: trade liberalization or the lowering of tariffs and other barriers to imports that has led to the doubling of the value of world trade from 24% of world gdp in 1960 to 48% in 2003 (world bank 2006); the reorganization of production and service provision across multiple national borders by multinational or transnational corporations, such as outsourcing or the pursuit of integration into global value chains, resulting in a global labor market (world bank 1995, 2007; woodall 2006); the conditions attached to world bank and imf loans, and to the rescheduling of loan payments, including structural adjustment programs (saps); financial liberalization, which exposes national economies to the uncertainties created by large and volatile short-term capital flows; the significant growth in the world's urban population caused by transnational economic integration; the promotion of export-oriented agricultural development that does not consider the social and environmental consequences of such actions, which result from the pressures on governments around the world to increase export earnings (stonich and bailey 2000, p.
23); and the promotion and reinforcement of a market-oriented concept of health sector reform that strongly favors private provision and financing (petchesky 2003; koivusalo and mackintosh 2005, p. 3). critics of the above international trade and finance policies argue that it is not at all clear that globalization leads to substantial poverty reduction. they point to the large-scale and extremely unequal distribution of wealth and income in the countries that have been identified as "globalizers" witnessing rapidly growing economies. it is argued that even a little redistribution of income through progressive taxation and targeted social programs would go farther in terms of poverty reduction than many years of solid economic growth (jubany and meltzer 2004; paes de barros et al. 2002; de ferranti et al. 2004). further, it is argued that as countries compete for foreign direct investment and outsourced production, the need to appear business-friendly may limit their ability to adopt and implement labor standards, occupational safety and health regulations, and other redistributive programs (cornia 2005); global integration of production may cause a sharp decline in the wages of, and demand for, low-skilled workers; large amounts of debt limit the ability of many developing and developed countries to meet other human needs related to health, education, water, public safety, sanitation, nutrition, etc.; globalization may lead to an intensification of worldwide social relations which link distant localities in such a way that local happenings are shaped by events occurring many miles away, and vice versa (giddens 1990, p.
64); much of the urbanization caused by international finance and trade policies occurs in countries that have limited resources to provide urban infrastructures; and the emphasis on private financing and provision of health care leads to large-scale underinsurance and uninsurance in both the developed and developing countries (labonte and schrecker 2007, p. 6). globalization and the quest for exports are also blamed for increased smoking and tobacco-related mortality in the developing countries (murray and lopez 1997, p. 1498). also noteworthy is the escalation in the sale of weapons, much of it facilitated by western governments. the wars that have raged on and off in sub-saharan africa, latin america, and asia are tragic examples of the ill effects of aggressive weapon sales to these places (mcmichael and beaglehole 2000, p. 497). although the adverse effects of globalization discussed above tend to affect developing countries more than the united states, there are significant adverse consequences of globalization for the united states as well. some of these include the perpetuation and exacerbation of the gap between the rich and the poor, a large public debt profile that puts significant pressure on social and other safety net policies and programs, the prevalence of uninsurance and underinsurance, job insecurity and reduced wages, the collapse of large manufacturing businesses, increased availability of and demand for illicit drugs, and the emergence of new infectious diseases that spread more easily due to increased migration and population mobility (ubokudom and khubchandani 2010, p. 20). for example, american labor unions complain that the north american free trade agreement (nafta) among canada, mexico, and the united states, which came into force on january 1, 1994, has led to the loss of american jobs. job loss causes stress and the loss of both income and the financial means to pay for medical care.
from the onset, health issues were neither at the heart nor at the margins of foreign policy theory or practice, for two reasons. first, the protection and promotion of population health did not factor into world leaders' calculations of what "competition in anarchy" (the condition from which foreign policy dynamics flow) required of their countries, nor was health for all seriously (as opposed to rhetorically) considered a pathway to a better world. second, those who were engaged in public health did not participate significantly in discussions of foreign policy (fidler 2007, p. 243). therefore, there were only small and nonsubstantial linkages between health and foreign policy (harris 2004, p. 171). actions linking health issues or problems with foreign policy have been strongest when the potential impact on economic prosperity, national security, the environment, and development is severe. this has resulted in attention to health threats that are acute and severe, those that are projected to result in mass casualties, and those that are believed to be geographically widespread. in contrast, long-term health risks, or health risks that cause minor health problems, affect a limited number of people, or are not geographically widespread, attract little attention in relation to foreign policy. in other words, acute epidemic infections and major public health emergencies, such as natural or human-induced disasters, bioterrorism, and chemical and radiation accidents, have received significant attention (fidler 2007, p. 243; lee et al. 2007, p. 208; katz and singer 2007, p. 233).
a few specific examples of "attention-receiving" public health problems include the previously unknown human immunodeficiency virus/acquired immunodeficiency syndrome (hiv/aids), which appeared in the united states in the early 1980s; the hantavirus, believed to have originated in korea; eastern equine encephalitis, which is found in the eastern and north-central united states, canada, parts of central and south america, and the caribbean islands; western equine encephalitis, which occurs primarily in the western and central united states, canada, and parts of south america; the polio virus that is believed to have originated in india in 2005; the spread of severe acute respiratory syndrome (sars) from china in 2003; and the 2009 outbreak of the deadly h1n1 (swine flu) influenza believed to have originated in mexico (cdc 2009; shi and singh 2008, p. 578; friis 2007, p. 87). in summary, many health problems, particularly infectious diseases, are widely recognized as global concerns that cross national and international boundaries. consequently, countries frequently include in their foreign policies strategies on those diseases that have the potential to threaten their domestic interests. this is likely to lead to higher prioritization, more attention, greater political support, and more funding. for example, in the united states, projections of the impact of hiv/aids on the workforces of many countries, and the prevalence of hiv among military personnel in several regions of the world, contributed to the determination that hiv/aids was a security issue. similarly, awareness of the havoc caused by previous influenza pandemics and the economic impact of the small and short outbreak of sars led to serious preparations by the who and its member states for the next influenza pandemic (katz and singer 2007, p. 234).
this understanding has led to many international agreements covering health and the environment, including the agreement on sanitary and phytosanitary measures, the international standards organization's classification system for food labeling, the un framework convention on climate change, and the kyoto protocol, to name a few. data from the national aeronautics and space administration (nasa) show that the earth's surface has warmed by about 0.66°c between january and november 2010. that period was reported to be the warmest january-november in the nasa goddard institute for space studies (giss) analysis, which covers 131 years. the period was only a few hundredths of a degree warmer than 2005, so it is possible that the final giss results for the full year, 2010, would be warmer than or in the same range as 2005. further, the available data also show that the earth's surface has warmed by more than 0.8°c over the past century and by about 0.6°c in the past 3 decades (nasa 2010). therefore, contrary to frequent assertions that global warming has slowed in the past decade, global warming has proceeded in the decade that ended in 2010 just as fast as it did in the prior 2 decades (nasa 2010). the health hazards posed by climate change and global warming are inequitable, diverse, global, and probably irreversible over human time scales (patz et al. 2005, p. 310; campbell-lendrum et al. 2007, p. 235).
they include increased risks of extreme weather, such as floods and storms, fatal heat waves, long-term drought conditions in many areas of the world, surface water pollution and groundwater contamination, the melting of glaciers that supply freshwater to large population centers, salination of sources of agricultural and drinking water, increased rates of water extraction that may precipitate declines in supply, and the creation of a conducive environment for the global killers that are very sensitive to climatic conditions, such as malaria, diarrhea, and protein-energy malnutrition (campbell-lendrum et al. 2007, p. 235; friis 2007, p. 3). as we noted under the actual causes of death, these three global killers cause many deaths in the united states; they are also said to account for about three million deaths worldwide each year (who 2004). the relationship between migration, population mobility, and health is receiving renewed attention due to the emerging and reemerging infectious diseases that were discussed previously in this section. the health of both legal and illegal migrants to any country is affected by the determinants of health discussed earlier in this chapter, as well as by the risks that are present in their country of origin or that arise from the migration process itself (macpherson and gushulak 2001, p. 390). this is very true of the united states, where a significant portion of the annual population growth is due to migration. the effects of population mobility and migration on the country's health care system and the provision of health services are reported daily in the pages of newspapers. first, there is likely to be increased demand for services due to population growth, whether that growth is due to increased fertility rates or migration.
for example, the exponential growth in medicaid expenditures in states that border mexico is said to be due to the increased demand for medical services by illegal immigrants as well as to the medical needs of an aging population. second, officials of the states that share boundaries with mexico complain about increased violent crimes committed by illegal immigrants, crimes that take a heavy toll on population health and health care expenditures. third, increased migration compels more health services planning, infrastructure maintenance, development and training of a diverse medical workforce to cater to the increasingly diverse population, and the establishment of public health programs for health promotion, health protection, and disease prevention (macpherson et al. 2007, p. 200; cohen et al. 2002, p. 90). and, fourth, the opinion pages of newspapers carry citizens' letters that attribute the success of previous terrorist campaigns to the nearly open border policy the united states maintained prior to september 11, 2001 (9/11). since the 9/11 attacks, border security and entry visa requirements have been tightened. border control measures are now centered on inspecting and excluding goods, vessels, and people that pose serious health or terrorist threats to the united states. other countries have similar measures. the world has changed. indeed, the world has changed significantly. while most people are actively planning how to make their lives better, a few others are actively planning how to destroy lives and settle political and ideological differences through acts of violence. no place and no people are immune from the threats of violence, terrorism, and natural disasters. in the past 9 or 10 years, the united states has experienced disasters that have led to a rethinking of how to keep the population safe.
the terrorist attacks in the united states on september 11, 2001, an unsuccessful attempt to initiate an anthrax epidemic in october 2001, and the devastation caused by hurricane katrina during the 2005 atlantic hurricane season led to significant loss of lives and property and revealed deficiencies in the public health and emergency response systems in the country. because of both underfunding and understaffing, and perhaps because the changes that have taken place in the world were not anticipated, the public health system was unable to develop or implement a comprehensive program of preparedness, prevention, response, and recovery (us general accounting office 2003). following the disasters, state, local, and federal public health agencies began to identify weaknesses in the nation's public health infrastructure and to reevaluate existing disaster response plans (baker and koplan 2002, p. 15). the shortcomings revealed in the nation's disaster response plans elevated public health to an important national instrument for anticipating and dealing with terrorism, infectious disease outbreaks, and natural disasters. the guidance on responses to chemical, biological, radiological, nuclear, and explosive threats provided by the cdc, and by other national organizations and universities, helped individual state governments to develop statewide policies that took their unique concerns into account (ziskin and harris 2007, p. 1584; shah 2006, p. 1414; gebbie and turnock 2006, p. 923).
public health plans to deal with terrorist threats, infectious diseases, and natural disasters now involve public health agencies at the federal, state, and local levels of government; other government and private agencies, such as the departments of justice and defense; the food and drug administration; private, public, and nonprofit hospitals, clinics, and nursing homes; private and public practitioners, such as nurses and physicians; blood supply organizations, such as the american red cross; police and fire departments; and individuals and groups throughout the country. as would be expected, expenditures for government public health activities, while still low relative to expenditures for medical care, rose from $47 billion in 2001 to about $69.4 billion in 2008, an increase of 47.6% from 2001 (centers for medicare and medicaid services (cms) 2010). it remains to be seen if this enthusiasm for public health, demonstrated by increased funding since 2001, can be sustained. the law that is used as the basis for most of the new emergency preparedness measures is the homeland security act of 2002. in addition to strengthening the public health infrastructure, the law calls for improved inspections of food products entering the united states. it also calls for better measures to contain attacks on food and water supplies, to protect vital infrastructures, such as nuclear facilities, and to track biological materials anywhere in the country. further, the provisions of the law have been used to justify tough and controversial interrogation techniques, such as waterboarding. similarly, presidential executive order 13295, signed by george w. bush on april 4, 2003, authorizes the apprehension, detention, or conditional release of individuals with suspected communicable diseases, such as sars, cholera, diphtheria, infectious tuberculosis, plague, smallpox, yellow fever, and viral hemorrhagic fevers such as ebola (the free dictionary 2008).
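the growth in public health expenditures quoted above can be checked with a few lines of arithmetic. the sketch below uses only the two rounded dollar figures given in the text; the function name is illustrative. with these rounded inputs the result is closer to 47.7% than to the 47.6% quoted, a difference that presumably reflects rounding in the published figures.

```python
def percent_increase(start: float, end: float) -> float:
    """Return the percentage change from start to end."""
    return (end - start) / start * 100.0


# figures quoted in the text, in billions of us dollars
spending_2001 = 47.0
spending_2008 = 69.4

increase = percent_increase(spending_2001, spending_2008)
print(f"increase from 2001 to 2008: {increase:.2f}%")
```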
in summation, international trade and finance, infectious disease epidemics, global warming and climate change, population mobility, and natural disasters and terrorism significantly affect the united states health care delivery and policymaking systems. in addition, medical technology and us health care professionals and consumers are also affected by global factors. for example, because the united states is widely believed to be the world leader in the development and utilization of high-technology medical protocols, foreign dignitaries come here for specialty care. also, nurses and foreign medical school graduates (fmgs) move to the united states to acquire licenses to practice in the country. this so-called brain drain causes shortages of medical practitioners in the developing countries and alleviates some of the shortages in the health professional shortage areas of the united states. furthermore, telemedicine allows us physicians to transmit radiological images to other countries where they are analyzed at lower costs. on the other hand, us consulting pathologists and radiologists provide their services to other parts of the world. also, advanced medical equipment and supplies that are abandoned here a few years after deployment are shipped to the developing and less technology-intensive developed countries at low costs. the high costs paid by us consumers are used to subsidize the low costs paid by the developing countries (ubokudom and khubchandani 2010, p. 21). this chapter has identified the impacts of physical, social, cultural, and global factors on health and health policymaking. health can be defined under the medical or wellness models. the health status of the us population, or the population of any other country for that matter, is largely determined by factors that have important physical, social, and economic dimensions.
these include preventive medicine, genetic disposition, social and economic circumstances, environmental conditions, lifestyles and behaviors, and medical care. these determinants of health are associated, in various degrees, with the real or actual causes of death in the country. research demonstrates that most of the deaths in the country are attributable to a small number of largely controllable behaviors and exposures, or due to factors that fall under the preventive, social, economic, environmental, and lifestyle and behavioral determinants of health. these determinants of health are not only affected by the broad national and personal factors identified in the chapter, they are also affected by global or international factors, including trade and finance, outbreaks of infectious diseases, climate change, natural disasters, and the threats of terrorism and population mobility. but even though most of the deaths in the country are the result of social, cultural, economic, environmental, and global factors, medical care is also an important determinant of health that cannot be ignored. an insurance card is one of the important factors that influence access to medical services. consequently, the next chapter examines demographic factors, most especially americans' ability to access medical services, and the disparities in health among segments of the population.
socioeconomic inequalities in health: no easy solution socioeconomic disparities in health: pathways and policies poor families in america's health care crisis a clinical trial of the effects of dietary patterns on blood pressure in why are some people healthy and others not?
strengthening the nation's public health infrastructure: historic challenge, unprecedented opportunity in jonas's health care delivery in the united states health disparities in the united states: social class, race, ethnicity and health unemployment and ill health: understanding the relationship communities and health policy: a pathway for change relationship of physical health status and health problems health and ways of living: the alameda county study social determinants of health: meeting at a crossroads planning for health pathways to health: the role of social factors the role of medical care in determining health: creating an inventory of benefits genes, dreams, and realities global climate change: implications for international public health policy healthy people: the surgeon general's report on health promotion and disease prevention national health expenditure projections priorities among recommended clinical preventive services the case for diversity in the health care workforce policy reform and income distribution is inequality bad for our health explanations for socioeconomic differentials in mortality: evidence from britain and elsewhere the social context of science: cancer and the environment policy implications of the gradient of health and wealth inequality in latin america & the caribbean: breaking with history?
the hidden epidemic: confronting sexually transmitted diseases green of another color: building effective partnerships between foundations and the environmental justice movement the health of latino children: urgent priorities, unanswered questions, and a research agenda essentials of environmental health the public health workforce, 2006: new challenges the consequences of modernity socioeconomic status differences in recreational physical activity levels and real and perceived access to a supportive physical environment on the distribution of underlying causes of death social determinants of health false dawn: the delusions of global capitalism youth risk behavior surveillance-united states educational status and active life expectancy among older blacks and whites marrying foreign policy and health: feasible or doomed to fail? united states department of health and human services (usdhhs) deaths: leading causes for 2007 institute of medicine (us) committee on health and behavior: research, practice, and policy. health and behavior: the interplay of biological, behavioral, and societal influences class-the ignored determinant of the nation's health breathing easier: community-based strategies to prevent asthma the achilles' heel of latin america: the state of the debate on inequality, fpp 04-5. ottawa, canada. canadian foundation for the americas (focal) regional obesity and risk of cardiovascular disease: the framingham study socioeconomic factors and cardiovascular disease: a review of the literature why health is important to u.s.
foreign policy health and security in foreign policy neighborhoods and health overview: a current perspective on twin studies of schizophrenia fundamentals of genetic epidemiology advancing health disparities research within the health care system: a conceptual framework world trade: bringing health into the picture statement on the politicization of evidence-based clinical research in commercialization of health care: global and local dynamics and policy responses globalization and the social determinants of health: the role of the global marketplace (part 2 of 3) socioeconomic factors, health behaviors, and mortality: results from a nationally representative prospective study of u.s. adults bridging health and foreign policy: the role of health impact assessments religion and health: is there an association, is it valid, and is it causal? benefits and harms of prostate-specific antigen screening for prostate cancer: an evidence update for the u.s. preventive services task force health and foreign policy: influences of migration and population mobility human mobility and population health: new approaches in a globalizing world health inequalities among british civil servants: the whitehall ii study inequalities in death-specific explanations of a general pattern the spiritual history religion and depression: a review of the literature religious involvement and mortality: a meta-analytic review income dynamics and adult mortality in the united states the case for more active policy attention to health promotion actual causes of death in the united states health, foreign policy and security the role of medicine: dream socioeconomic status differences in vulnerability to undesirable life events the changing global context of public health septicemia social determinants of health: what, how, why, and now social patterns of distress mortality among teenagers aged 12-19 years: united states actual causes of death in the united states screening for bladder cancer: 
u.s. preventive services task force recommendation statement hereditary colorectal cancer: from bedside to bench and back patterns of mortality by county and race: 1965-94. cambridge, ma: harvard center for population and development studies alternative projections of mortality and disability by cause 1990-2020: global burden of disease study (phs) 2002-1232), 1-430. national institutes of health. 1992. a mortality study of 1.3 million persons by demographic, social, and economic factors: 1979-1985 follow-up what we mean by social determinants of health meeting the millennium poverty reduction targets in latin america and the caribbean letter: linkage of familial combined hyperlipidaemia to chromosome 1q21-q23 impact of regional climate change on human health global prescriptions: gendering health and human rights levels of physical activity and inactivity in children and adults in the united states: current evidence and research issues nephrotic syndrome: nephrosis alzheimer's disease; senile dementia-alzheimer's type (sdat) retrieved biologic-marker studies in alcoholism factors influencing the view of patients with gynecologic cancer about end-of-life decisions the strategy of preventive medicine successful aging effects on blood pressure of reduced dietary sodium and the dietary approaches to stop hypertension (dash) diet commentary: include a social determinants of health approach to reduce health inequities development as freedom the formation of the emergency medical services delivering health care in america: a systems approach socioeconomic differentials in mortality risk among men screened for the multiple risk factor intervention trial: i.
white men resisting the blue revolution: contending coalitions surrounding industrial shrimp farming health services research: a working model effects of dietary patterns on blood pressure: subgroup analysis of the dietary approaches to stop hypertension (dash) clinical trial rethinking mckeown: the relationship between public health and social change severe acute respiratory syndrome (sars) poor diets, little exercise leading cause of preventable illness and deaths world development report 1995: workers in an integrating world an introduction to epidemiology the ecology of health policymaking and reform in the united states of america united states department of health and human services (usdhhs). 1988. the surgeon general's report on nutrition and health physical activity and health: a report of the surgeon general bioterrorism: public health response to anthrax incidents of section 1: overview of u.s. preventive services task force structure and processes trends in pregnancy rates for the united states, 1976-97: an update epidemics in history: disease, power and imperialism unhealthy societies: the affliction of inequality social sources of racial disparities in health socioeconomic differences in health: a review and redirection men's health: chronic lower respiratory diseases world health organization (who). 1948. preamble to the constitution of the world health organization as adopted by the international health conference state health policy for terrorism preparedness
key: cord-301300-nfl9z8c7 authors: slavova, svetla; larochelle, marc r.; root, elisabeth; feaster, daniel j.; villani, jennifer; knott, charles e.; talbert, jeffrey; mack, aimee; crane, dushka; bernson, dana; booth, austin; walsh, sharon l.
title: operationalizing and selecting outcome measures for the healing communities study date: 2020-10-02 journal: drug alcohol depend doi: 10.1016/j.drugalcdep.2020.108328 sha: doc_id: 301300 cord_uid: nfl9z8c7 background: the helping to end addiction long-term (healing) communities study (hcs) is a multisite, parallel-group, cluster randomized wait-list controlled trial evaluating the impact of the communities that heal intervention to reduce opioid overdose deaths and associated adverse outcomes. this paper presents the approach used to define and align administrative data across the four research sites to measure key study outcomes. methods: priority was given to using administrative data and established data collection infrastructure to ensure reliable, timely, and sustainable measures and to harmonize study outcomes across the hcs sites. results: the research teams established multiple data use agreements and developed technical specifications for more than 80 study measures. the primary outcome, number of opioid overdose deaths, will be measured from death certificate data. three secondary outcome measures will support hypothesis testing for specific evidence-based practices known to decrease opioid overdose deaths: (1) number of naloxone units distributed in hcs communities; (2) number of unique hcs residents receiving food and drug administration-approved buprenorphine products for treatment of opioid use disorder; and (3) number of hcs residents with new incidents of high-risk opioid prescribing. conclusions: the hcs has already made an impact on existing data capacity in the four states. in addition to providing data needed to measure study outcomes, the hcs will provide methodology and tools to facilitate data-driven responses to the opioid epidemic, and establish a central repository for community-level longitudinal data to help researchers and public health practitioners study and understand different aspects of the communities that heal framework. 
the helping to end addiction long-term (healing) communities study (hcs) is a multisite, parallel-group, cluster randomized wait-list controlled trial evaluating the impact of the communities that heal intervention to reduce opioid overdose deaths and other associated adverse outcomes (walsh et al., in press). the intervention includes three components: (1) a community-engaged and data-driven process to assist communities in selecting and implementing evidence-based practices to address opioid misuse and opioid use disorder (oud), and reduce opioid overdose deaths (martinez et al., in press); (2) the opioid reduction continuum of care approach, which contains a compendium of evidence-based practices and strategies to expand opioid overdose education and naloxone distribution, medications for opioid use disorder (moud), and safe opioid prescribing (winhusen et al., in press); and (3) community-based health communication campaigns to increase awareness and demand for the evidence-based practices and reduce their stigma (lefebvre et al., in press). a total of 67 communities across four highly affected states (kentucky, massachusetts, new york, ohio) were recruited to participate in the hcs and randomized to one of two waves in a wait-list, controlled design. the communities were randomized to receive either the intervention (referred to as wave 1 communities) or a wait-list control (referred to as wave 2 communities). the hcs has one primary hypothesis (h1) and three secondary hypotheses (h2, h3, h4) (walsh et al., in press). it is hypothesized that during the evaluation period (january 1, 2021 to december 31, 2021), wave 1 communities, compared with wave 2 communities, will:
h1: reduce opioid overdose deaths (primary outcome);
h2: increase naloxone distribution;
h3: expand utilization of moud; and
h4: reduce high-risk opioid prescribing.
quality data are needed to measure the study outcomes and assess the impact of the integrated intervention and the specific evidence-based practices. data are also an important component of the intervention because communities can use data on opioid overdose mortality and morbidity, supplemented with data on community resources and needs, to develop a data-driven action plan to expand the utilization of evidence-based practices. communities also need timely and accurate data for visualization in data dashboards designed to monitor the uptake and success of the selected evidence-based practices and strategies, and respond to emerging challenges and community needs (wu et al., in press). this article describes the process for using administrative data to develop the hcs outcome measures aligned with the primary and three secondary hypotheses of the study. each research site developed collaborations and partnerships with state agencies and other data owners to understand the regulations and policies governing the use of administrative data for research. an hcs data capture work group was formed and included representatives from the four research study sites, the data coordinating center at rti international, and the sponsors (the national institute on drug abuse and the substance abuse and mental health services administration [samhsa]). a structured consensus decision-making strategy was used to:
a. identify data sources to measure the primary, secondary, and other study outcomes;
c. develop data governance strategy and data use agreements;
d. develop study measure definitions, technical specifications, programming code, procedures for data quality control, common data model, and schedule for data transfer to the data coordinating center;
during development, priority was given to use of existing state-level administrative data sources with regulated and sustained data collections and established infrastructures for quality assurance and control.
this is an efficient and cost-effective way to study community-level changes, capitalizing on the federal and state investments for collecting standardized surveillance data, and adopting, when possible, validated surveillance definitions. in addition, using multiple administrative data sources allowed for the construction of measures at the community/population level (i.e., unit of analysis being hcs community) by aggregating individual-level data (e.g., unit of measurement being a community resident or a provider practicing within an hcs community) that best matched hcs outcomes. priority also was given to data sources with timely reporting, preferably with less than a 6-month lag between the occurrence of events and data availability. timeliness and near-real-time access to data were critical for three reasons: (1) the community engagement component of the intervention is data-driven and dependent on providing ongoing data feedback to community partners throughout the process (walsh et al., in press); (2) it is imperative that the study results are made publicly available quickly because of the magnitude and impact of the opioid crisis on us communities; and (3) the hcs was designed as a four-year study. this study protocol (pro00038088) was approved by advarra inc., the healing communities study single institutional review board. the study is registered with
this section presents the results from the selection and operationalization of administrative data measures for study hypotheses testing (table 1), as well as study measures for secondary analyses and monitoring the progress in implementing evidence-based practices.
the primary hcs outcome is the number of opioid overdose deaths among residents in hcs communities. death certificate records are the traditional data source for capturing drug overdose deaths (isw7, 2012; hedegaard et al., 2020; warner et al., 2013).
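the two data requirements described above, aggregation of individual-level records to the community level and a reporting lag under 6 months, can be illustrated with a minimal python sketch. the record layout, community identifiers, and the 183-day stand-in for "6 months" are all assumptions made for illustration; they are not taken from the hcs technical specifications.

```python
from collections import Counter
from datetime import date

# hypothetical individual-level records:
# (community_id, event_date, date_record_became_available)
records = [
    ("ky-01", date(2021, 1, 15), date(2021, 4, 1)),
    ("ky-01", date(2021, 2, 3), date(2021, 10, 20)),
    ("oh-07", date(2021, 3, 9), date(2021, 5, 30)),
]

MAX_LAG_DAYS = 183  # assumed stand-in for "less than a 6-month lag"


def community_counts(rows):
    """Aggregate individual-level events to one count per community."""
    return Counter(community for community, _, _ in rows)


def timely_share(rows, max_lag_days=MAX_LAG_DAYS):
    """Fraction of records available within the lag threshold."""
    timely = sum(
        (available - event).days < max_lag_days
        for _, event, available in rows
    )
    return timely / len(rows)


counts = community_counts(records)
share = timely_share(records)
```

here the community is the unit of analysis (`counts` holds one number per community), while each tuple in `records` is the unit of measurement.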
suspected drug overdose deaths are considered unnatural deaths and are subject to medicolegal death investigation before the death is certified by a coroner or a medical examiner (hanzlick, 2014; hanzlick and combs, 1998), and a completed death certificate is filed with the office of vital statistics in the state where the death occurred (nchs, 2003a, b). selected fields from the death certificate record are then sent to the national center for health statistics, where the cause-of-death information is coded with one underlying and up to 20 multiple (i.e., supplementary) cause-of-death codes using the international classification of diseases, tenth revision (icd-10) coding system (who, 2016). the cdc definition for identifying drug overdose deaths with opioid involvement in icd-10-coded death certificate records is commonly accepted by researchers and public health agencies. using icd-10-coded death certificate data, drug overdose deaths are identified as deaths with an underlying icd-10 cause-of-death code x40

previous research has identified several methodological challenges for identification of opioid involvement in drug overdose deaths (e.g., lack of routinely performed postmortem toxicology testing, especially for fentanyl and designer opioids; challenges to detection and quantification of new designer opioids; variation in jurisdictional office policy in completion of drug overdose death certificates; differences across jurisdictions in the proportion of drug overdose death certificates that do not list the specific contributing drugs) (buchanich et al., 2018; ruhm, 2018; slavova et al., 2015; slavova et al., 2019; warner and hedegaard, 2018; warner et al., 2013).
prior to the evaluation period, the research sites are administering surveys among the coroners, medical examiners, and toxicology labs serving both wave 1 and wave 2 communities to collect information related to death investigations of suspected drug overdose deaths (including postmortem toxicology testing, timelines for death certificate completion, and possible covid-19-related changes in these processes that could have lasting effects during the hcs evaluation period) in order to understand possible limitations and changes in the completeness and accuracy of the primary outcome measure. each hcs research site will use death certificate records from its state office of vital statistics to identify hcs resident deaths with opioid contribution. one challenge in using death certificate data for the primary study outcome is the lag between the death date and the date when death certificate records are available for analysis (rossen et al., 2017). sites have been working with local coroners, medical examiners, and state vital statistics offices to improve the timeliness of data availability across all hcs communities. in 2019, almost all the death certificate records in kentucky, massachusetts, new york, and ohio were available for analysis within 6 months after the overdose death (cdc, 2020). the following steps describe the hcs operational definition for capturing opioid overdose deaths for testing the primary study hypothesis:
step 1: all sites will use state death certificate files captured 6 months after the end of the evaluation period to identify the death certificate records for residents of hcs communities with a date of death within the evaluation period, an underlying cause-of

this process will ensure a quality harmonized measure that is captured consistently across the four research sites.
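the icd-10 logic behind the primary outcome can be illustrated with a minimal python sketch. it assumes the code ranges given in the table 1 definition of the measure (underlying cause x40-x44, x60-x64, x85, or y10-y14, with a multiple cause-of-death code of t40.0-t40.4 or t40.6). the record layout, field names, and function names are hypothetical and are not the hcs programming code, and the supplementary use of medicolegal death investigation records is not modeled.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Set

# Underlying cause-of-death codes for drug overdose: X40-X44, X60-X64, X85, Y10-Y14.
UNDERLYING_OVERDOSE = re.compile(r"^(X4[0-4]|X6[0-4]|X85|Y1[0-4])")
# Multiple cause-of-death codes for opioid involvement: T40.0-T40.4 and T40.6.
# The dot is optional because files often store codes without it (e.g. "T401").
OPIOID_INVOLVED = re.compile(r"^T40\.?[012346]")


@dataclass
class DeathRecord:
    """Hypothetical, simplified death certificate record."""
    community_id: str           # community of residence
    date_of_death: date
    underlying_cause: str       # one ICD-10 underlying cause-of-death code
    multiple_causes: List[str]  # up to 20 ICD-10 multiple cause-of-death codes


def _normalize(code: str) -> str:
    return code.strip().upper()


def is_opioid_overdose(record: DeathRecord) -> bool:
    """True if the death is a drug overdose with opioid involvement."""
    if not UNDERLYING_OVERDOSE.match(_normalize(record.underlying_cause)):
        return False
    return any(OPIOID_INVOLVED.match(_normalize(c)) for c in record.multiple_causes)


def count_opioid_overdose_deaths(
    records: Iterable[DeathRecord],
    hcs_communities: Set[str],
    period_start: date,
    period_end: date,
) -> int:
    """Count opioid overdose deaths among residents of the given communities
    with a date of death inside the evaluation period (bounds inclusive)."""
    return sum(
        1
        for r in records
        if r.community_id in hcs_communities
        and period_start <= r.date_of_death <= period_end
        and is_opioid_overdose(r)
    )
```

matching on code prefixes keeps the sketch tolerant of both dotted and undotted code storage; a production specification would fix one storage convention per site.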
number of naloxone units distributed in an hcs community as measured by the sum of the naloxone units (1)

the us surgeon general's advisory on naloxone emphasized that expanding naloxone availability in communities is a key public health response to the opioid crisis (hhs, 2018). research has shown that opioid overdose death rates were reduced both in communities that implemented overdose education and naloxone distribution programs (walley et al., 2013) and in jurisdictions enacting laws allowing direct pharmacist dispensing of naloxone (abouk et al., 2019). there are three limitations of this data source: (1) no information is provided about the number of pharmacies dispensing naloxone prescriptions; (2) suppression rules preclude reporting of data for geographic areas with fewer than four pharmacies; and (3) prescriptions are assigned to communities based on the location of the pharmacy rather than the customer's residence. suppression rules impacted three communities in massachusetts; this was resolved by requesting the combined total for the three communities and allocating it in proportion to the community populations. the assignment of a pharmacy to a community based on pharmacy address may result in an overcount of naloxone in a community with pharmacies that serve residents of non-hcs communities, or an undercount if a pharmacy is just outside an hcs community but serves hcs residents. a limitation of the measure is that it may not capture naloxone distributed in hospitals, correctional facilities, or other venues when the naloxone is purchased with support from private donations, foundations, or locally awarded federal funding.

the number of hcs residents receiving buprenorphine products approved by the food

there are three fda-approved moud products: buprenorphine, methadone, and naltrexone. multiple randomized controlled trials (krupitsky et al., 2011; mattick et al., 2009, 2014) have demonstrated that moud can reduce cravings and illicit opioid use.
observational studies have found that buprenorphine and methadone are associated with reduced mortality (sordo et al., 2017). thus, as part of the opioid reduction continuum of care approach, communities are required to expand moud with buprenorphine and/or methadone (winhusen et al., in press). access to moud is geographically heterogeneous and differs by patient population (haffajee et al., 2019; pashmineh azar et al., 2020). for example, opioid treatment programs providing methadone are less common in rural than urban areas (joudrey et al., 2019). criminal justice-involved populations, where there has been a historical preference toward naltrexone (krawczyk et al., 2017), are less likely to receive buprenorphine and methadone. there also is a great deal of variation in billing and documentation of the type of moud, administration modality (e.g., office-based administration as compared with prescription filled at pharmacy by patient), provider type, state policies, and insurance coverage. accurate estimation of the prevalence of oud in hcs communities is important for planning and scaling of the moud uptake. however, estimating the population in need of moud is a challenge for the hcs. the hcs team is working on developing improved estimations for oud prevalence in each hcs community using a capture-recapture statistical methodology previously applied by barocas et al. (2018). five potential sources for measurement of moud were identified: medicaid claims, all-payer claims databases, pdmps, opioid treatment program central registries, and pharmacy-dispensed prescriptions (iqvia). the disparate data sources vary in completeness and timeliness. all-payer claims databases are large state databases that typically include medical claims across multiple settings (e.g., hospitalizations, emergency department visits, outpatient visits), pharmacy claims, and eligibility and provider files.
data are collected from both public and private payers and reported directly by insurers to a state repository. all-payer claims databases are structured similarly to medicaid claims data and allow for linking of individuals across claims to identify individuals with oud and their treatments. the key advantage is the inclusion of private insurance, allowing more accurate estimation of prevalence of individuals with diagnosed oud and treatment with moud in a state. all-payer claims have been used previously in oud-related research (burke et al., 2020; freedman et al., 2016; lebaron et al., 2019; saloner et al., 2017). seventeen states have all

(grecu et al., 2018).

opioid treatment programs are the only facilities allowed to deliver methadone for oud but may also offer buprenorphine and naltrexone along with behavioral therapy. they must be certified by samhsa and an independent, samhsa-approved accrediting body to dispense moud (samhsa, 2020). they also must be licensed by the state in which they operate and must be registered with the drug enforcement administration. the registries are established to prevent a patient's simultaneous enrollment in multiple locations (e-cfr, 2020a). the number of enrolled patients, aggregated at the hcs community level as permitted by §2.52 (research) of 42 cfr part 2 (e-cfr, 2020a), can be used as a measure of methadone treatment uptake, but central registries were not available in all four research sites. iqvia data capture pharmacy-dispensed naltrexone. however, naltrexone is indicated for treatment of both oud and alcohol use disorder. because pharmacy records do not include diagnosis-related information for making this distinction, this data source may overestimate the uptake of naltrexone for oud.

defined as ≥90 mg mme over 3 calendar months; or (4) incident overlapping opioid and benzodiazepine prescriptions greater than 30 days over 3 calendar months.
high opioid dosages, co-prescribing opioids with benzodiazepines or other sedative hypnotics, and receipt of opioid prescriptions from multiple providers or pharmacies are associated with opioid-related harms (bohnert et al., 2011; cochran et al., 2017; dunn et al., 2010; rose et al., 2018). characteristics of opioid initiation are also important. for example, initiating opioid treatment with extended-release/long-acting opioids (miller et al., 2015) is associated with increased risk of overdose, and longer prescription duration is associated with transition to long-term opioid use (shah et al., 2017). based on available evidence, in 2016 the cdc published guidelines (dowell et al., 2016) for prescribing opioids for chronic pain. numerous quality measures have been developed to encourage and measure progress toward improving the safety of opioid prescribing. after decades of increases, rates of opioid prescribing peaked and are now declining, although they remain historically high (guy et al., 2017; schieber et al., 2020). developing safe and patient-centered approaches for individuals receiving long-term opioid therapy has been a challenge to address in underlying evidence or guidelines. increasing

two constructs with the best available evidence to support decreases in opioid-related harms were targeted with the intention of reducing the number of individuals initiating high-risk opioid prescribing and the likelihood that new opioid prescribing episodes develop into long-term episodes (shah et al., 2017). state pdmps were identified as the best available data source for these measures across all four research sites. a limitation of these data is the lack of clinical context, such as diagnostic codes for the disease or condition associated with the prescribed medication.
as a result, at the patient level, it is difficult to assess the appropriateness of a high-dose opioid prescribing episode, such as that needed for management of severe pain for patients with cancer or end-of-life care. another limitation of this measure is the lack of automated data sharing among state pdmps on prescriptions filled across state boundaries. a benefit of the pdmp data source is that it is timely and captures dispensed prescriptions for controlled substances paid for by both insurance and cash. medicaid claims and all-payer claims databases are alternative data sources. the main advantage of claims data compared with pdmp data is the clinical context. however, claims data lack information on prescriptions paid for with cash or alternative insurance coverage, a payment pattern associated with increased risk of opioid-related harms (becker et al., 2017). medicaid claims are common across the sites, whereas all-payer claims databases exist in only two of the four states. claims data lag by at least 6 months, making them less useful for timely monitoring of progress. existing measures were identified through a review of the literature, including existing measures from the cdc, the national quality forum, and the national committee for quality assurance, and were subsequently adapted to the constructs identified above. all opioid agonist medications, including tramadol, were included, with the exception of antitussive codeine formulations and buprenorphine formulated for pain. the reasons for their exclusion are a lack of clear guidance for conversion to morphine milligram equivalents and a lack of evidence that buprenorphine, a partial opioid agonist, conveys the same risk as that of full opioid agonists. to maintain consistency across sites, the team developed a standardized list of national drug codes for opioids, benzodiazepines, and moud using the medi-span electronic drug file (med-file) v2 and the drug inactive date file (wolters kluwer, 2020).
the standardized study drug list is updated quarterly. the med-file includes product names, dosage forms, strength, the ndc, and generic product identifier (gpi). the gpi is a 14-digit number that allows identification of drug products by primary and secondary classifications and simplifies identification of similar drug products from different manufacturers or different packaging. because our study requires baseline data on opioid utilization, the inactive date file is used to include drugs that may be currently inactive but were used during the baseline period. all gpis beginning with the classification "65", which identifies any drug product containing an opioid or combination, are included in the opioid list. next, opioid products that are not likely to be used in the outpatient/ambulatory pharmacy setting, such as bulk powder, bulk chemicals, and dosage forms typically used in hospitals or hospice settings (e.g., epidurals, ivs), are excluded. products classified as cough/cold/allergy combinations, cough medications, antidiarrheal/probiotic agents, buprenorphine products used for oud and pain, and methadone products used for oud were also excluded. the cdc file that identifies oral mmes (cdc, 2019b) was used to add mmes to each opioid product and to identify products as long-acting or short-acting. to ensure the hcs list includes all current and inactive products, the cdc list was cross-referenced with the list of all gpi products. the benzodiazepine products are identified using the gpi classification "57", which identifies any drug product containing a benzodiazepine or combination. products that are not likely to be used in the outpatient/ambulatory pharmacy setting, such as bulk powder, bulk chemicals, and dosage forms typically used in hospitals or hospice settings, were excluded. the full list of gpis for opioids, benzodiazepines, and buprenorphine is included in the appendix.
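the gpi-based selection described above can be sketched in python. this is a simplified illustration, not the hcs procedure: the `Product` record, the spelling of the excluded dosage forms, and the function names are assumptions, and the additional class-based exclusions (cough/cold products, moud products, and so on) and the cdc mme cross-reference are left out.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set

OPIOID_GPI_PREFIX = "65"  # GPI classification for opioid-containing products
BENZO_GPI_PREFIX = "57"   # GPI classification for benzodiazepine-containing products

# Illustrative spellings only; the real dosage-form vocabulary comes from the drug file.
EXCLUDED_DOSAGE_FORMS = {"bulk powder", "bulk chemical", "epidural", "iv"}


@dataclass(frozen=True)
class Product:
    """Hypothetical, simplified drug file row."""
    ndc: str          # national drug code
    gpi: str          # 14-digit generic product identifier
    dosage_form: str


def classify(product: Product) -> Optional[str]:
    """Return 'opioid', 'benzodiazepine', or None if the product is not listed."""
    if product.dosage_form.strip().lower() in EXCLUDED_DOSAGE_FORMS:
        return None  # unlikely to be dispensed in an outpatient pharmacy
    if product.gpi.startswith(OPIOID_GPI_PREFIX):
        return "opioid"
    if product.gpi.startswith(BENZO_GPI_PREFIX):
        return "benzodiazepine"
    return None


def build_drug_lists(products: Iterable[Product]) -> Dict[str, Set[str]]:
    """Group the NDCs of retained products by drug class."""
    lists: Dict[str, Set[str]] = {"opioid": set(), "benzodiazepine": set()}
    for product in products:
        drug_class = classify(product)
        if drug_class is not None:
            lists[drug_class].add(product.ndc)
    return lists
```

keeping the output keyed by ndc reflects that dispensing records are matched on ndc, while the gpi is used only to decide class membership.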
the success of the intervention relies on the community's ability to assess the complexity and specifics of the local opioid epidemic and identify the best ways to implement and promote evidence-based practices locally. a set of additional measures was developed, to be shared with the intervention communities as counts and/or rates over time and visualized as trends on community-tailored dashboards (wu et al., in press). these measures monitor the complexity of the opioid-related harms as well as the progress in the three main evidence-based practices from the opioid reduction continuum of care approach (winhusen et al., in press). a list of selected study measures is provided in table 2.

working closely with state stakeholders, the research sites also developed standard operating procedures for data quality assurance and control, and improved data collection (e.g., improved timeliness of existing data sources or development of new administrative data collections). the hcs data coordinating center created a common data model to match the complexity and scale of the clinical trial design and measures and the conditions of the data use agreements. the common data model consisted of (1) an internal identification number for each hcs measure outcome; (2) frequency of reporting (i.e., daily, monthly, quarterly, semi-annually, or annually); (3) display features for dashboards and visualization (i.e., display date, display value, research site/research community identification number, label); and (4) internal usage information (i.e., is estimate, is suppressed [per data use agreement suppression requirements], notes, stratification, and version number). the common data model allows coordinated presentation of data to communities to aid with decision making and monitoring of progress and allows the hcs consortium and trial sponsors to routinely monitor progress.
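one way to picture a single row of the common data model is the python sketch below. the four groups of fields follow the description above, but every field name, type, and default is an assumption made for illustration; the actual hcs schema is not given in the text.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Frequency(Enum):
    """Reporting frequencies named in the common data model description."""
    DAILY = "daily"
    MONTHLY = "monthly"
    QUARTERLY = "quarterly"
    SEMI_ANNUALLY = "semi-annually"
    ANNUALLY = "annually"


@dataclass
class MeasureRecord:
    """Hypothetical row of the common data model."""
    # (1) internal identification number of the measure
    measure_id: str
    # (2) frequency of reporting
    frequency: Frequency
    # (3) display features for dashboards and visualization
    display_date: date
    display_value: Optional[float]
    community_id: str
    label: str
    # (4) internal usage information
    is_estimate: bool = False
    is_suppressed: bool = False  # set when a data use agreement requires suppression
    notes: str = ""
    stratification: str = ""
    version: int = 1


def dashboard_value(record: MeasureRecord) -> Optional[float]:
    """Value to show on a community dashboard; suppressed values are hidden."""
    return None if record.is_suppressed else record.display_value
```

carrying the suppression flag on each row, rather than dropping suppressed rows, lets a dashboard show that a value exists but cannot be displayed.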
during the first year of the hcs, the data capture work group evaluated more than 15 administrative data sources across the four states for their ability to support study measures in multiple relevant domains. the research site teams established multiple data use agreements with data owners to support the calculation of more than 80 study measures based on administrative data collections, such as death certificates, emergency medical services data, inpatient and emergency department discharge billing records, medicaid claims, syndromic surveillance data, pdmp data, drug enforcement administration data on drug take back collection sites and events, data 2000 waivered prescriber data, hiv registry data, and naloxone distribution and dispensed prescription data. there were many challenges related to state variations in data timeliness and content that needed to be addressed, and compromises were made to achieve harmonization across research sites. harmonization of medicaid measure specifications required participation from the state partners because individual states have some unique codes or code bundles for capturing specific services. collaborative workgroups with participation from state partners were formed with specific focus on medicaid data, pdmp data, and emergency medical services data. another challenge is the lack of quality validation studies for many of the measures, so the degree of possible misclassification of diagnosis or service codes used in some specifications is unknown. one example is attempting to identify oud prevalence using diagnosis codes in medical claims or other administrative data sources knowing that oud is often underdiagnosed. massachusetts also is seeking to partner with emergency medical services agencies to improve timeliness of data reporting and completeness of race/ethnicity data.
new york developed a cloud-based application to facilitate data aggregation and sharing both for hcs and future research projects. in ohio, the hcs team partnered with the innovateohio platform, which was established by executive order a few weeks prior to the hcs project start date. the hcs has been a highly successful "test case" for how a single technology platform could be leveraged to provide necessary data quickly and efficiently for a large study involving multiple state agencies. the platform facilitated a multi-agency data use agreement and curates, cleans, and links data sets across multiple ohio state agencies monthly. this allowed the ohio hcs team to sign one data use agreement to cover all project data activities. the hcs will provide methodology and tools to facilitate data-driven responses to the opioid epidemic at the local, state, and national levels.

number of opioid overdose deaths among hcs residents during the evaluation period as measured by deaths with an underlying cause-of-death being drug overdose (i.e., an underlying cause-of-death icd-10 code in the range x40-x44, x60-x64, x85, y10-y14) where opioids, alone or in combination with other drugs (i.e., a multiple cause-of-death icd-10 code in the range t40.0-t40.4, or t40.6), were determined to be contributing to the drug overdose death. data source: drug overdose deaths are captured by death certificate records; additional medicolegal death investigation records can be used (per established protocol) to determine opioid involvement when specific drugs contributing to the overdose deaths are not listed on the death certificate.
number of naloxone units distributed in an hcs community during the evaluation period as measured by the sum of (1) the naloxone units distributed to community residents by overdose education and naloxone distribution programs with support from state and federal funding, including dedicated hcs funding, and (2) the naloxone units dispensed by retail pharmacies located within hcs communities. data source: data are captured from state administrative records and supplemented by study records to include naloxone funded through hcs, as well as the iqvia xponent® database.

number of hcs residents receiving buprenorphine products approved by the food and drug administration for treatment of opioid use disorder as measured by the number of unique individuals residing in an hcs community who had at least one dispensed prescription for these products during the evaluation period. data source: state prescription drug monitoring program data.

number of hcs residents with new incidents of high-risk opioid prescribing during the evaluation period as measured by the number of residents in an hcs community who met at least one of the following four criteria for a new high-risk opioid prescribing episode after a washout period of at least 45 days: (1) incident opioid prescribing episode greater than 30 days duration (continuous opioid receipt with no more than a 7-day gap); (2) starting an incident opioid prescribing episode with extended-release or long-acting opioid formulation; (3) incident high-dose opioid prescribing, defined as ≥90 mg morphine equivalent dose over 3 calendar months; or (4) incident overlapping opioid and benzodiazepine prescriptions greater than 30 days over 3 calendar months.
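criterion (1) of the high-risk opioid prescribing measure can be sketched in python as follows. this is a minimal illustration under stated assumptions, not the hcs programming code: fills are hypothetical (fill date, days supply) pairs for one resident, episode duration is taken as the span from the first fill date to the last covered day, and the washout is counted as opioid-free days since the previous episode or since a hypothetical start of the observation window.

```python
from datetime import date, timedelta
from typing import List, Tuple

MAX_GAP_DAYS = 7        # continuous receipt tolerates gaps of up to 7 days
WASHOUT_DAYS = 45       # opioid-free days required before an incident episode
LONG_EPISODE_DAYS = 30  # criterion (1): episode longer than 30 days

Fill = Tuple[date, int]      # (fill date, days supply), days supply >= 1
Episode = Tuple[date, date]  # (first fill date, last covered day)


def build_episodes(fills: List[Fill]) -> List[Episode]:
    """Merge fills into continuous episodes, allowing gaps up to MAX_GAP_DAYS."""
    episodes: List[Episode] = []
    for fill_date, days_supply in sorted(fills):
        covered_through = fill_date + timedelta(days=days_supply - 1)
        if episodes:
            start, end = episodes[-1]
            uncovered_days = (fill_date - end).days - 1
            if uncovered_days <= MAX_GAP_DAYS:
                # Same episode: extend its end if this fill covers later days.
                episodes[-1] = (start, max(end, covered_through))
                continue
        episodes.append((fill_date, covered_through))
    return episodes


def incident_long_episodes(fills: List[Fill], observation_start: date) -> List[Episode]:
    """Episodes longer than 30 days that begin after at least 45 opioid-free days."""
    flagged: List[Episode] = []
    previous_end = observation_start - timedelta(days=1)
    for start, end in build_episodes(fills):
        opioid_free_days = (start - previous_end).days - 1
        duration_days = (end - start).days + 1
        if opioid_free_days >= WASHOUT_DAYS and duration_days > LONG_EPISODE_DAYS:
            flagged.append((start, end))
        previous_end = end
    return flagged
```

the remaining criteria (formulation at initiation, mme-based dose, and opioid/benzodiazepine overlap) would be evaluated on the same episodes but need product-level attributes that this sketch does not carry.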
association between state laws facilitating pharmacy distribution of naloxone and risk of fatal overdose
the medicaid outcomes distributed research network (modrn) innovative solutions for state medicaid programs to leverage their data, build their analytic capacity, and create evidence-based policy
claims database council
estimated prevalence of opioid use disorder in massachusetts
multiple sources of prescription payment and risky opioid therapy among veterans
association between opioid prescribing patterns and opioid overdose-related deaths
the effect of incomplete death certificates on estimates of unintentional opioid-related overdose deaths in the united states
trends in opioid use disorder and overdose among opioid-naive individuals receiving an opioid prescription in massachusetts from
annual surveillance report of drug-related risks and outcomes - united states
opioid overdose. data resources. analyzing prescription data and morphine milligram equivalents (mme)
centers for disease control and prevention (cdc), 2020. nchs data quality measures
an examination of claims-based predictors of overdose from a large medicaid program
kasper controlled substance reporting guide, . kentucky cabinet for health and family services
nonfatal opioid overdose standardized surveillance case definition
no shortcuts to safer opioid prescribing
cdc guideline for prescribing opioids for chronic pain - united states
opioid prescriptions for chronic pain and overdose: a cohort study
electronic code of federal regulations, 2020a. §2.16 security for records
electronic code of federal regulations, 2020b. e-cfr website
all-payer claims databases - uses and expanded prospects after gobeille
mandatory access prescription drug monitoring programs and prescription drug abuse
vital signs: changes in opioid prescribing in the united states
characteristics of us counties with high opioid overdose mortality and low capacity to deliver medications for opioid use disorder
a perspective on medicolegal death investigation in the united states
medical examiner and coroner systems: history and trends
drug overdose deaths in the united states
consensus recommendations for national and state poisoning surveillance
drive times to opioid treatment programs in urban and rural counties in 5 us states
only one in twenty justice-referred adults in specialty treatment for opioid use receive methadone or buprenorphine
injectable extended-release naltrexone for opioid dependence: a double-blind, placebo-controlled, multicentre randomised trial
medication for opioid use disorder after nonfatal opioid overdose and association with mortality: a cohort study
opioid epidemic or pain crisis? using the virginia all payer claims database to describe opioid medication prescribing patterns and potential harms for patients with cancer
health communication campaigns to drive demand for evidence-based practices and reduce stigma in the healing communities study
pharmacy reporting and data submission
methadone maintenance therapy versus no opioid replacement therapy for opioid dependence
buprenorphine maintenance versus placebo or methadone maintenance for opioid dependence
prescription opioid duration of action and the risk of unintentional overdose among patients receiving opioid therapy
medical examiners' and coroners' handbook on death registration and fetal death reporting
u.s. standard certificate of death
new york department of health
ohio data submission dispenser guide
rise and regional disparities in buprenorphine utilization in the united states
prescription drug monitoring program training and technical assistance center. accessed on
prescription drug monitoring program training and technical assistance center (pdmp ttac), 2020. pdmp policies and practices
potentially inappropriate opioid prescribing, overdose, and mortality
method to adjust provisional counts of drug overdose deaths for underreporting. division of vital statistics
corrected us opioid-involved drug poisoning deaths and mortality rates
patterns of buprenorphine-naloxone treatment for opioid use disorder in a multistate population
variation in adult outpatient opioid prescription dispensing by age and sex - united states
drug and opioid-involved overdose deaths - united states
characteristics of initial prescription episodes and likelihood of long-term opioid use - united states
methodological complexities in quantifying rates of fatal opioid-related overdose
drug overdose deaths: let's get specific
mortality risk during and after opioid substitution treatment: systematic review and meta-analysis of cohort studies
medications for opioid use disorder
substance abuse mental health services administration (samhsa), 2020. certification of opioid treatment programs (otps)
us surgeon general's advisory on naloxone and opioid overdose
department of health and human services (hhs), 2019. hhs guide for clinicians on the appropriate dosage reduction or discontinuation of long-term opioid analgesics
fda identifies harm reported from sudden discontinuation of opioid pain medicines and requires label changes to guide prescribers on gradual
opioid overdose rates and implementation of overdose education and nasal naloxone distribution in massachusetts: interrupted time series analysis
identifying opioid overdose deaths using vital statistics data
state variation in certifying manner of death and drugs involved in drug intoxication deaths
evidence-based practices in the healing communities study. drug alcohol depend
drug data. www.wolterskluwercdi.com. accessed on
community dashboards to support data-informed decision making in the healing communities study

table 1. healing communities study primary and secondary outcome measures for hypothesis testing

all authors contributed to the development of the hcs measures, the development of the framework for the manuscript, and the editing of the manuscript. s. slavova, j. villani, and s.l. walsh drafted the introduction, s. slavova drafted the methods, m.r. larochelle developed the table, and each author participated in drafting parts of the results or discussion sections.

key: cord-284786-pua14ogz authors: coker, eric s.; cavalli, laura; fabrizi, enrico; guastella, gianni; lippo, enrico; parisi, maria laura; pontarollo, nicola; rizzati, massimiliano; varacca, alessandro; vergalli, sergio title: the effects of air pollution on covid-19 related mortality in northern italy date: 2020-08-04 journal: environ resour econ (dordr) doi: 10.1007/s10640-020-00486-1 sha: doc_id: 284786 cord_uid: pua14ogz

long-term exposure to ambient air pollutant concentrations is known to cause chronic lung inflammation, a condition that may promote increased severity of covid-19 syndrome caused by the novel coronavirus (sars-cov-2).
in this paper, we empirically investigate the ecologic association between long-term concentrations of area-level fine particulate matter (pm(2.5)) and excess deaths in the first quarter of 2020 in municipalities of northern italy. the study accounts for potential spatial confounding factors related to urbanization that may have influenced the spreading of sars-cov-2 and related covid-19 mortality. our epidemiological analysis uses geographical information (e.g., municipalities) and negative binomial regression to assess whether ambient pm(2.5) concentration and excess mortality have a similar spatial distribution. our analysis suggests a positive association between ambient pm(2.5) concentration and excess mortality in northern italy related to the covid-19 epidemic. our estimates suggest that a one-unit increase in pm(2.5) concentration (µg/m(3)) is associated with a 9% (95% confidence interval: 6–12%) increase in covid-19 related mortality.

valley, an extension of flat river lands enclosed between the alps and apennines mountains, which causes the stagnation of pollutants due to low ventilation (giulianelli et al. 2014). these factors help to characterize the po valley's peculiarity with respect to different european areas with comparable urban and industrial density levels (eeftens et al. 2012). moreover, in addition to the urbanized and industrial areas, the remainder of the valley hosts intensive agricultural activity. local studies on emission sources show that the composition of the final concentration values varies with the position of monitoring stations, with different sources acting locally or diffusely (for instance, high emissions from traffic close to cities and background biomass burning diffused across the whole region) (bigi and ghermandi 2016; larsen et al. 2012).
indeed, relative to the eu ambient air quality directives, which set the air quality standard for the protection of health at 25 μg/m3 for the averaging period of a calendar year, the po valley shows values consistently near or above the threshold. these values often range in the 25-30 μg/m3 interval with peaks above 30 μg/m3, which in europe are matched only in southern poland and other smaller eastern european clusters (eea 2019). compared to its share of the population, lombardy is disproportionately impacted by covid-19 related mortality, with approximately 53% of italy's covid-19 deaths as of april 15, 2020 (odone et al. 2020). lombardy is also the most impacted italian region in terms of the total number of excess deaths in the first quarter of 2020 compared to the same period of the previous years. comparing the official covid-19 death data with registry deaths shows that the latter is almost 70% larger than the former in lombardy, 27% larger in emilia-romagna, and 18% and 16% larger in veneto and piemonte, respectively. it is, therefore, imperative to consider the role that pm may have played in such disproportionate covid-19 deaths in northern italy. there are a number of plausible pathways by which airborne pm may impact covid-19 related morbidity and mortality. existing data already show a strong positive correlation between viral respiratory infection incidence and amb