key: cord-265680-ztk6l2n2 authors: deng, j; peng, z y; wen, z x; dong, g q; xie, m x; xu, g g title: high covid-19 mortality in the uk: lessons to be learnt from hubei province – are under-detected “silent hypoxia” and subsequently low admission rate to blame? date: 2020-08-31 journal: qjm doi: 10.1093/qjmed/hcaa262 sha: doc_id: 265680 cord_uid: ztk6l2n2 nan and hospitals overwhelmed. then, 16 fangcangs were built and 15 used to manage about a quarter of all covid patients in the provincial capital city. these enabled admissions of most mildly or moderately infected patients. there, vital signs were regularly checked and, based on the chinese national guideline for covid, 3 nasal o2 was supplied to those whose spo2 became ≤93% [but not severe enough for icu admission]. fangcangs' peak bed usage was over 95%. with centralised isolation and timely treatment to prevent transmission and deterioration of the infection, and with occasional transfers of patients with worsening symptoms to icu, this drastically decreased the mortality over the entire epidemic in hubei [ table 1 ]. after the 76-day lockdown was lifted on 8 april, between 14 may and 1 june, wuhan tested 9,899,828 residents to screen for "hidden" sars-cov-2 infections. together with those already tested since the outbreak in january, and excluding those who had left the city during the spring festival from about 10 january and had not returned since the lockdown on 23 january, that was virtually the entire population of the city. as a result, no new cases were found, with only ... the 68,135 confirmed cases in table 1 are a highly reliable reflection of the epidemic in hubei after the initial chaotic statistics in january. in contrast, the uk had more time to prepare, with a medical system that seemingly coped well with the already peaked pandemic. however, by 31 july 2020, it had suffered a population mortality rate considerably higher than that of hubei.
we should treat the ratio in table 1 we share the above lessons and experiences learnt from hubei and would like to provoke discussion to address the paradox seen in the uk and other regions. for example, by the end of july, new york had most covid deaths in the us with 32,683 fatalities, yet a "nightingale" hospital costing $52 million treated only 79 virus patients. 9 without effective and safe drugs and/or vaccines facing imminent covid resurgences, the hubei approaches may be worth considering. the (uk) office for national statistics. comparisons of all-cause mortality between european countries and regions 42 per cent of oxygen-supported beds set aside for coronavirus are empty. mail online national health commission of the people's republic of china. new diagnosis and treatment scheme for novel coronavirus infected pneumonia second wave coronavirus narrative fails to hold clinical course and risk factors for mortality of adult inpatients with covid-19 in wuhan, china: a retrospective cohort study critical care crisis and some recommendations during the covid-19 epidemic in china on behalf of chinese thoracic society and chinese association of chest physician) ct imaging of the covid-19 this hospital cost $52 million. it treated 79 virus patients. the new york times key: cord-326599-n0vmb946 authors: leung, char title: the difference in the incubation period of 2019 novel coronavirus (sars-cov-2) infection between travelers to hubei and non-travelers: the need of a longer quarantine period date: 2020-03-18 journal: infection control and hospital epidemiology doi: 10.1017/ice.2020.81 sha: doc_id: 326599 cord_uid: n0vmb946 data collected from the individual cases reported by the media were used to estimate the distribution of the incubation period of travelers to hubei and non-travelers. upon the finding of longer and more volatile incubation period in travelers, the duration of quarantine should be extended to three weeks. 
an epidemic of viral pneumonia started in wuhan, the capital of hubei province in china, in december 2019. a new coronavirus was identified and named by the world health organization as sars-cov-2. it has been found that it is genetically similar to sars-cov and mers-cov 1 . recently, snakes have been suggested as the natural reservoirs of sars-cov-2, assuming that the huanan seafood wholesale market in wuhan is the origin of the virus 2 . different preventive measures have been implemented by health authorities with the 14-day quarantine being the commonly used. while previous studies have estimated the incubation period of sars-cov-2 to help determining the length of quarantine, it has recently been observed that some patients rather had mild symptoms such as cough and low-grade fever or even no symptoms 3 and that the incubation period might have been 24 days 4 , constituting greater threats to the effectiveness of entry screening. against this background, the present work estimated the distribution of incubation periods of patients infected in and outside hubei. because the details of most cases were reported by the media and were not available on the official web pages of the local health authorities in china, three searches for individual cases reported by the media between 20 th january 2020 and 12 th february (first cases outside hubei reported on 20 th january 2020) with search terms "pneumonia" and "wuhan" and "age" and "new" in chinese were performed on google from 7 th , 8 th , and 9 th february. the inclusion of the search term "age" intended to narrow down the search results since the presence of "age" in an article implied a description of an individual case. individual cases with time of exposure and symptom onset as well as type of exposure were eligible for inclusion. there was no language restriction. 
since most patients did not have complete information about the source of infection, the time of exposure was allowed to be a time interval within which the exposure was believed to lie. in contrast, patients could recall the exact date of symptom onset. the present paper considered two types of exposure, (i) traveling to hubei, china, and (ii) contact with the source of infection such as an infected person or places where infectious agents stayed. for data accuracy, only confirmed cases outside hubei province and within china were considered. the following data were abstracted, (i) location at which the case was confirmed, (ii) gender, (iii) age, where f and s were the cdf of the incubation period and the time of symptom onset, respectively. therefore, to find the maximum likelihood estimates of  , the maxima of the sum of the individual log-likelihood functions, either the results of maximum likelihood estimation are shown in table 1 . the aic suggested that the weibull distribution provided the best fit to the data. both indicator variables of the shape and scale parameters were significant in the weibull model, suggesting different incubation period distributions between the two groups of patients. [ table 1 here] the very first observation of the incubation period of sars-cov-2 came from the national health such difference might be due to the difference in infectious dose since travelers to hubei might be exposed to different sources of infection multiple times during their stay in hubei. in contrast, patients with no travel history to hubei were temporarily exposed to their infected relatives, friends or colleagues with mild or even no symptoms. it is possible that the incubation period of non-travelers was highly volatile, as suggested by the higher variance in the gamma model that provided slightly poorer fit. this could potentially pose a threat to the effectiveness of the existing preventive measures. 
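the likelihood construction described above can be illustrated with a minimal sketch. it assumes the usual coarse-data setup: each case has an exposure window and an exact onset date, so the incubation time lies between (onset minus end of window) and (onset minus start of window), and the case contributes the difference of the weibull cdf at those two bounds. the exact form of the paper's equation is not recoverable from the text, the sketch fits a single group without the traveler/non-traveler indicator variables, and all function names, parameter values and example cases are hypothetical.

```python
import math

def weibull_cdf(t, shape, scale):
    """Weibull CDF F(t) = 1 - exp(-(t / scale) ** shape); 0 for t <= 0."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))

def interval_log_likelihood(cases, shape, scale):
    """Sum of log-likelihood terms for interval-censored incubation times.

    Each case is (exposure_start, exposure_end, onset), in days on a common
    time axis, so the incubation time lies in
    [onset - exposure_end, onset - exposure_start].
    """
    total = 0.0
    for exposure_start, exposure_end, onset in cases:
        lower = onset - exposure_end
        upper = onset - exposure_start
        prob = weibull_cdf(upper, shape, scale) - weibull_cdf(lower, shape, scale)
        total += math.log(prob)
    return total

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def weibull_quantile(p, shape, scale):
    """Inverse Weibull CDF, e.g. p = 0.95 for the 95th percentile."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)

# Hypothetical cases and parameters, for illustration only.
cases = [(0, 3, 8), (0, 1, 6), (2, 5, 12)]
ll = interval_log_likelihood(cases, shape=2.0, scale=8.0)
print("AIC:", round(aic(ll, n_params=2), 2))
print("95th percentile (days):", round(weibull_quantile(0.95, 2.0, 8.0), 1))
```

in practice the log-likelihood would be maximised over the shape and scale parameters with a numerical optimiser, and the aic compared across candidate distributions.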
the duration of quarantine period must be considered with caution. as a comparison, previous studies on the incubation period for sars-cov-2 are shown in table 2 . the 95 th percentiles reported in previous studies varied between 10.3 and 13.3 days, consistent with the current practice of quarantine period of 2 weeks. however, the present study found that the 95 th percentile of non-travelers could be 14.6 days and up to 17.1 days under 95% level of confidence. coupled with the high variability of the incubation period, it is suggested that the duration of the quarantine period of 3 weeks is deemed more suitable. [ a novel coronavirus from patients with pneumonia in china homologous recombination within the spike glycoprotein of the newly identified coronavirus may boost cross-species transmission from snake to human a familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster clinical characteristics of 2019 novel coronavirus infection in china estimating incubation period distributions with coarse data early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia incubation period of 2019 novel coronavirus (covid-19) infections among travellers from wuhan key: cord-291750-4s93wniq authors: lv, boyan; li, zhongyan; chen, yajuan; long, cheng; fu, xinmiao title: global covid-19 fatality analysis reveals hubei-like countries potentially with severe outbreaks date: 2020-04-14 journal: j infect doi: 10.1016/j.jinf.2020.03.029 sha: doc_id: 291750 cord_uid: 4s93wniq • cfr in iran in the early stage of the outbreak is highest among all the countries. • cfrs in the usa and italy are similar to hubei province in the early stage. • cfrs in south korea are similar to outside hubei, indicating less severity. • our findings highlight the severity of outbreaks globally, particular in the usa. 
the outbreak of 2019 novel coronavirus disease (covid-19) is ongoing in china, 1 where it appears to have reached a late stage, and is just starting to devastate other countries. 2 as of 13 march 2020, there have been 80991 confirmed covid-19 cases and 3180 deaths in china, more than the 51767 confirmed cases and 1775 deaths outside china. 3 however, the daily increase in covid-19 cases outside china has greatly surpassed that inside china 3 (over 7000 versus 11 on 13 march), raising deep concerns about the outbreaks outside china. here we attempted to uncover their characteristics by comparative analysis of crude fatality ratios (cfrs). we collected the officially released cumulative numbers of confirmed cases and deaths (from 23 january to 13 march 2020) for mainland china, the epicenter of the outbreak (i.e., hubei province and wuhan city), outside hubei (in china) and outside wuhan (in hubei), as well as for typical countries reporting a substantial number of deaths, including south korea, japan, iran, italy, usa, france and spain ( fig. 1 ) . cfrs in hubei and wuhan are significantly higher than those outside hubei and outside wuhan, and they are relatively higher in the early stage of the outbreak than in the late stage ( fig. 1 a) , in line with earlier comprehensive reports by china cdc and who. 4 , 5 the outbreaks outside china are overall lagging approximately one month behind china ( fig. 1 b vs fig. 1 a) . cfr in iran in the early stage (from 21 february to late march) is extremely high while cfr in korea is low and stable over time. notably, cfr in iran has significantly decreased since 2 march while cfr in italy has increased considerably in the past 10 days.
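as a worked example of the quantity being compared, the sketch below computes a crude fatality ratio, taken here as cumulative deaths divided by cumulative confirmed cases, from the 13 march 2020 counts quoted above. the function name is hypothetical, and the paper's own analysis uses cfr series over 10-day windows rather than this single-day snapshot.

```python
def crude_fatality_ratio(deaths, confirmed):
    """Return cumulative deaths / cumulative confirmed cases, as a percentage."""
    if confirmed <= 0:
        raise ValueError("confirmed must be positive")
    return 100.0 * deaths / confirmed

# Cumulative counts as of 13 March 2020, as quoted in the text.
cfr_china = crude_fatality_ratio(deaths=3180, confirmed=80991)
cfr_outside_china = crude_fatality_ratio(deaths=1775, confirmed=51767)

print(f"china: {cfr_china:.2f}%")                  # china: 3.93%
print(f"outside china: {cfr_outside_china:.2f}%")  # outside china: 3.43%
```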
cfrs over a 10-day period, i.e., from 23 january to 1 february for china and other specific periods for countries outside china (for detail, refer to s1.xls file), were plotted as mean ± sd with 95% confidence intervals (in the black box), with the median shown as short lines. statistical analysis was performed in spss using anova, and significance levels ( p value) for all the pairs are shown in table s1 . p values larger than 0.05 between wuhan/hubei and other countries are colored in red, indicating no significant difference (i.e., broadly similar to each other) and the relative severity of the epidemic therein; the p value between outside hubei and south korea is 0.55 (colored in blue), indicating a relatively mild or controllable epidemic in south korea. next, we performed comparative statistical analysis on cfrs over a period of 10 days in the early stage of the outbreaks outside china and in china. in particular, two periods were set for iran and italy in order to fully cover their changing trends (for detail, refer to s1.xls file). results displayed in fig. 1 c revealed that i) cfrs in iran, italy and usa in the past ten days are not significantly different from hubei ( p being 0.24, 0.648 and 0.281, respectively); ii) cfr in usa is only marginally not significantly different from wuhan ( p being 0.0504); iii) cfr in iran from 22 february to 2 march is significantly different from any region of china (p < 0.001; table s1 ). in view of the detailed p values among all pairs (table s1 ), we propose the following ranking of the severity of covid-19 outbreaks in different countries/regions in terms of cfrs: iran > wuhan > hubei ≈ usa ≈ italy > outside wuhan ≈ spain ≈ japan ≈ france > south korea ≈ outside hubei. as cfr is defined as the number of deaths (numerator) among the number of confirmed cases (denominator), either an increase in the numerator or a decrease in the denominator leads to a higher cfr.
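the comparison of 10-day cfr series across regions can be illustrated by the f statistic that underlies a one-way anova. this is a minimal sketch in plain python, not the spss procedure used in the paper: it stops at the f statistic and degrees of freedom, does not compute p values or the pairwise comparisons reported in table s1, and the two daily cfr series are hypothetical values, not data from the paper.

```python
def one_way_anova_f(groups):
    """Return (F statistic, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical 10-day CFR series (percent) for two regions.
region_a = [2.9, 3.0, 3.1, 3.1, 3.2, 3.3, 3.3, 3.4, 3.5, 3.6]
region_b = [0.6, 0.6, 0.7, 0.7, 0.7, 0.8, 0.8, 0.8, 0.9, 0.9]
f_stat, df_b, df_w = one_way_anova_f([region_a, region_b])
print(f"F({df_b}, {df_w}) = {f_stat:.1f}")
```

a large f relative to the f distribution with the stated degrees of freedom corresponds to a small p value, i.e., the regions' cfrs differ by more than the within-period scatter would explain.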
in hubei/wuhan there were neither sufficient covid-19 test kits for identifying infections nor enough hospital beds for effective treatment of patients in the early stage of the outbreak. 6 these shortages led to numerous transmissions in households, reduced the apparent number of cumulative confirmed cases and caused untreated mild cases to become severe or critical and even to die, as implied by earlier reports. 4 , 7 as such, cfrs in hubei/wuhan were relatively high in the early stage. 5 , 7 the similar cfrs in hubei and usa/italy suggest that these countries may now face situations similar to those hubei experienced earlier. in support of this, recent news reports show that italy is extremely short of medical resources (beds and acute care equipment) while usa has some problems in covid-19 testing capacity. 8 in iran, these problems might be even more severe, such that its cfr is extremely high. to fight the covid-19 outbreaks in these hubei/wuhan-like countries, governments may need to implement control measures and supply medical resources in a timely manner, as hubei/wuhan did in the past month. 2 , 4 emergence of a novel coronavirus causing respiratory illness from wuhan who: coronavirus disease (covid-19) outbreak . available from who: coronavirus disease (covid-2019) situation reports report of the who-china joint mission on coronavirus disease the novel coronavirus pneumonia emergency response epidemiology team. vital surveillances: the epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (covid-19)-china corona virus disease 2019, a growing threat to children potential association between covid-19 mortality and healthcare resource availability cnn: the us is starting to look like italy on coronavirus lockdown this work is supported by the national natural science foundation of china (no. 31972918 and 31770830 to xf). the authors declare no conflict of interest.
supplementary material associated with this article can be found, in the online version, at doi: 10.1016/j.jinf.2020.03.029 . key: cord-352108-py93yvjy authors: tu, lh; li, h; zhang, hp; li, xd; lin, jj; xiong, cl title: birth defects data from surveillance hospitals in hubei province, china, 2001 – 2008 date: 2012-03-31 journal: iran j public health doi: nan sha: doc_id: 352108 cord_uid: py93yvjy background: to determine the prevalence and characteristics of birth defects in perinatal infants in hubei province during 2001–2008. methods: the prevalence of birth defects in perinatal infants delivered at 28 weeks of gestation or later was analyzed in hubei surveillance hospitals during 2001–2008. results: the incidence of birth defects in perinatal infants from 2001 to 2008 was 120.0 per 10,000 births, and increased by about 41% from 81.1 per 10,000 births in 2001 to 138.5 per 10,000 births in 2008. the incidence in the last 4 years (2005–2008) was much higher than in the first four (2001–2004) (χ(2)=77.64, p <0.05). the difference in prevalence between urban and rural areas was not significant in 2008 (χ(2)=0.03, p >0.05), but that between males and females was significant (χ(2)=5.24, p <0.05), with the prevalence in males being much higher. the prevalence of birth defects was slightly higher among mothers over 35 years old than among those under 35 years old, but the difference was not significant (χ(2)=1.98, p >0.05). the two leading birth defects were cleft lip and/or palate and polydactyly, followed by congenital heart disease, hydrocephaly, external ear malformation and neural tube defects. the prevalence of congenital heart disease was rising. conclusions: eight years of birth defects data indicate that the birth defect rate was on the rise and that the prevalence of birth defects in hubei province deserves attention. birth defect (bd) is a widely used term for a congenital malformation; it involves abnormalities in morphology, structure, function, metabolism, psychology and behavior.
with more epidemics prevented and controlled, it has gradually become the main cause of infant and child mortality. china is one of the countries with a high incidence of bd, with 800,000 to 1.2 million infants born with bd and disabilities annually, accounting for four to six percent of the country's total newborns. the chinese government has operated a surveillance system for monitoring bd since 1986. twenty-three types of bds were included in the system according to the international classification of diseases clinical modification codes, tenth edition (i.e., icd-10). in this research, we analyzed the prevalence and characteristics of bd in perinatal infants in hubei province (central china) during 2001-2008, in order to provide a scientific basis for developing measures to prevent and control bd in perinatal infants. the perinatal infants in this report comprise all cases of congenital anomaly (livebirths, fetal deaths, or stillbirths after 28 weeks of gestation and terminations of pregnancy for fetal anomaly; assessed within 7 days after delivery) from 52 surveillance hospitals in hubei province between january 1, 2001 and december 31, 2008. under the guideline of icd-10, which delineates twenty-three different types of bds, all infants were examined carefully by specialized doctors using routine obstetric diagnosis, physical examination or autopsy, and a "bd registration card" was filled in and reported through the network. statistical analysis of bd incidence was conducted using spss (version 12.0, chicago, il). the chi-square test was used to determine statistical significance. overall bd prevalence: 69,408 infants were covered in 52 surveillance hospitals in 2008 and 961 infants were found to have bds, making the prevalence 13.85‰, which exceeded that of each of the preceding 7 years. besides, the prevalence in the last 4 years (2005-2008) was much higher than in the first four (2001-2004) (χ 2 =77.64, p <0.05).
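the 2008 prevalence figure and the kind of chi-square comparison reported above can be reproduced in a few lines. the prevalence uses the counts given in the text (961 bds among 69,408 infants). the chi-square function is the standard pearson statistic for a 2x2 table without continuity correction; whether the authors' spss analysis applied a correction is not stated, and the example table is hypothetical because the period-specific counts are not given in the text.

```python
def prevalence_per_thousand(cases, births):
    """Prevalence expressed per 1,000 births (per mille)."""
    return 1000.0 * cases / births

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Counts for 2008 as reported in the text.
print(round(prevalence_per_thousand(961, 69408), 2))  # 13.85

# Hypothetical 2x2 table: rows = two periods, columns = (with BD, without BD).
print(round(chi_square_2x2(400, 39600, 550, 39450), 2))
```

with one degree of freedom, a statistic above 3.84 corresponds to p < 0.05.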
we analyzed the prevalence of bd by residence, gender and maternal age in 2008. the difference in prevalence between urban and rural areas was not significant (χ 2 =0.03, p >0.05), but that between males and females was significant (χ 2 =5.24, p <0.05), with the prevalence in males being much higher. the prevalence was slightly higher among mothers over 35 years old than among those under 35 years old, but the difference was not significant (χ 2 =1.98, p >0.05) ( table 1) . the two leading bds during 2001-2008 were cleft lip and/or palate (clp, without cleft palate) and polydactyly; their prevalence fluctuated around 1.7‰ and 1.3‰, respectively. in 2008, congenital heart defect (chd), hydrocephaly, external ear malformation (eem) and neural tube defects (ntds, including anencephalia, rachischisis and encephalocele) followed the top two bds. these top six bds constituted over half of all bds in 2008, with a much higher share (64%) in 2003. chd rose from ninth place in 2001 to third place in 2008, with its prevalence increasing from 0.27‰ to 1.28‰. however, the prevalence of ntds decreased sharply in 2008 (fig. 1). in general, the prenatal diagnosis rate of bd increased; for chd, for example, it rose from 20% in 2001 to 82.95% in 2008 (fig. 3). the department of public health spot-checked nine cities and seventeen counties in 2008, finding that the rate of missed reports in surveillance hospitals was 15.72% (1). the prevalence in 2008 was 13.85‰, a little higher than the national average prevalence (13.49‰). the data indicated an increase in bd prevalence, the same trend as in the whole nation (2) . speculatively, this may relate to several factors. first, the surveillance system was improved, making the data more reliable, especially with advanced prenatal diagnostic techniques and expertise. second, as the compulsory premarital health examination was abolished, the risks of bd increased.
furthermore, as modernization of the city, pollution and other modern life-related risks and lifestyle (e.g. delayed childbearing) (3) were responsible too. the prevalence in urban and rural were of no significant difference in 2008, different from the earlier data published before 2008 which indicated that prevalence in rural was a little higher than in urban (1). the enhanced maternal and child health care in rural may contribute to the offset. the difference of prevalence between male and female was significant; the former prevalence was much higher, which paralleled with the data previously published (4). advanced maternal age (over 35 years old) was related to a higher bd prevalence, but with no significance compared to the younger group, different from a majority of the data previously published (4, 5) . the recent emphasis on the antenatal care and prenatal screening for pregnant women over 35 years old might be partly related to this difference. from 2002 to 2008, clp and polydactyly were continuously on top two. this may relate to genetic and environmental factors. genetic factors contributing to clp have been identified for some syndromic cases and many genes associated with syndromic cases of clp have also been identified to contribute to the incidence of isolated cases of clp (6) . besides, clp and other congenital abnormalities have been linked to maternal hypoxia, as caused by e.g. maternal smoking (7-9), maternal alcohol abuse or some forms of maternal hypertension treatment (10) . integration of genetic and environmental risk using different methods may generate a synthesis that will both better characterize etiologies, as well as provide access to better clinical care and prevention (11) . a case-control study by jy luo, et al. (12) on polydactyly showed that heredity was the foremost risk factor. 
at the same time, chd increased year after year, ranking third in 2008, which was partly due to improved prenatal diagnostic techniques and skills and to increased environmental risks. together with the above three bds, hydrocephaly, eem and ntds were still the main bds and constituted over 50% of all bds, with the share shooting up to 64% in 2003, probably because the severe acute respiratory syndrome (sars) outbreak made hospitals screen for bd much more cautiously. the top six bds in hubei province were similar to those in the national data, but their prevalences were all higher than the national averages except for chd (2) . data published in the annual report of the national maternal and child health care surveillance and communications in june 2009 indicated that the prenatal diagnosis rate in hubei province in 2008 was 14.29%, a bit lower than in eastern coastal cities and provinces. all the data showed that maternal and child health care could do more to improve the present situation. as periconceptional folate supplementation has a strong protective effect against ntds (13) , pregnant women have been encouraged by doctors to take folic acid supplements to prevent ntds. indeed, ntds did decrease, which was owed mainly to folic acid supplementation (14) . to reduce the overall prevalence of bd, it is necessary to keep the surveillance system functioning properly and to provide prevention and health care services extensively. to reduce the rate of missed reports in surveillance hospitals and make the bd data more convenient and accessible, it would help to draw lessons from the european network of registers for the epidemiologic surveillance of congenital anomalies, the eurocat (15). besides, primary maternal and child health care services should be easily accessible for women of childbearing age, especially those living in rural areas (16) .
furthermore, studies on the integration of genetic and environmental risks, especially the links between environmental geochemistry and prevalence of bd (17) , are also important for better characterizing etiologies, as well as providing access to better clinical care and prevention. eight years' bd data indicate that the bd prevalence was rising and the bd prevalence in hubei province should be valued; prevention program of bd shall be better performed to decrease prevalence of birth deformation in perinatal infants based on improved perinatal care and prenatal diagnosis. ethical issues (including plagiarism, informed consent, misconduct, data fabrication and/or falsification, double publication and/or submission, redundancy, etc) have been completely observed by the authors. analysis of birth defects in 1997-2002 in hubei province annual report of the national maternal and child health care surveillance and communications leading a healthy lifestyle: the challenges for china analysis of surveillance of birth defect on perinatal fetuses in hubei province during 2001-2005 analysis of the report of perinatal birth defects monitoring in wuhan city from 2004 to 2007. maternal and child health care of china genetic causes of nonsyndromic cleft lip with or without cleft palate maternal cigarette smoking and the associated risk of having a child with orofacial clefts in china: a case-control study review on genetic variants and maternal smoking in the etiology of oral clefts and other birth defects tobacco smoking and oral clefts: a metaanalysis transverse limb deficiency, facial clefting and hypoxic renal damage: an association with treatment of maternal hypertension cleft lip and palate: understanding genetic and environmental influences a case-control study on genetic and environmental factors regarding polydactyly and syndactyly prevention of neural-tube defects with folic acid in china. china-u.s. 
collaborative project for neural tube defect prevention national neural tube defects prevention program in china paper 1: the eurocat network--organization and processes accessibility of primary health care workforce in rural china links between environmental geochemistry and rate of birth defects: shanxi province we thank all staff members involved in birth defects monitoring in hubei province and the china birth defect monitoring center. the authors declare that there is no conflict of interests. key: cord-273531-q9ah287w authors: li, yang; duan, guangfeng; xiong, linping title: characteristics of covid-19 near china's epidemic center date: 2020-06-26 journal: am j infect control doi: 10.1016/j.ajic.2020.06.191 sha: doc_id: 273531 cord_uid: q9ah287w background: this study described and analyzed the age, gender, infection sources, and timing characteristics of the 416 confirmed cases in two cities near the center of china's covid-19 outbreak. methods: this study used publicly available data to examine gender, age, source of infection, date returned from hubei, date of disease onset, date of first medical visit, date of final diagnosis, and date of recovery of covid-19 cases. results: public-use data revealed similar risks of infection by age and that the numbers of new and final diagnoses of confirmed cases first increased, peaked at about two weeks, and then gradually decreased. the main sources of infection were firsthand or secondhand exposure in hubei province and contact with confirmed cases, which mostly involved contact with infected household members. the mean periods from disease onset to first medical visit, first visit to final diagnosis, and final diagnosis to recovery were 4.44, 3.18, and 13.42 days, respectively. conclusions: the results suggest that the measures taken to control the rate of infection were effective. 
prevention and control efforts should respond as quickly as possible, isolate and control activities of individuals leaving infected areas, and restrict household contact transmission. the first novel coronavirus pneumonia (covid19) case was identified in wuhan, hubei province, china, on december 12, 2019, after which the disease gradually spread. 1 the emergence of the covid-19 epidemic coincided with the traditional chinese spring festival when most migrant workers return to their hometowns to celebrate. covid-19's novel infection presented few obvious upper respiratory symptoms (such as nasal discharge, sneezing, or sore throat), indicating the virus mainly was infecting the lower respiratory tract, 2,3 and most patients' first symptom was fever. the mode of transmission might have been by droplets, close contact, aerosol, mother-infant, or fecal-mouth transfer. during the incubation period, patients could transmit the virus to other humans. [4] [5] [6] [7] [8] [9] [10] as of february 22, 2020, 29 countries had reported confirmed cases of covid-19, of which china reported 76,936 confirmed cases, 22,888 recovery cases, and 2,442 deaths. 11, 12 according to the research reports, covid-19 is highly infectious, 13 and the large-scale population migration associated with the spring festival exacerbated the spread of the disease to outlying areas. xinyang city is in southern henan province, china, on the northern border of hubei province, and fuyang city is in northwest anhui province, adjacent to xinyang city. xinyang and fuyang are typical labor exporting cities near the epidemic's center. 14, 15 this study investigated aspects of the covid-19 transmission regarding xinyang and fuyang, described its characteristics, and evaluated the prevention and control measures. china's data on covid-19 are gathered based on its classification as a class b infectious disease. 
class b infectious diseases are considered mandatory notifiable diseases; all new cases must immediately be reported using the national infectious diseases monitoring information system database. to prevent rapid spread of the disease, the municipal health departments publicized information about the confirmed cases on the governments' websites, including personal information, personnel exposure, and the disease trajectory. we downloaded the case information from the target cities' health commission websites and transformed it into numerical data. the variables used in the analysis were: gender, age, source of infection, date returned from hubei, date of disease onset, date of first medical visit, date of final diagnosis, and date of recovery. as of february 22, 2020, 416 cases of effective data were collected in the two cities 16,17 : 270 cases in xinyang and 146 cases in fuyang. the sources of infection were: (1) firsthand or secondhand contact with hubei ("hubei exposure"), (2) "confirmed case contact," (3) "non-hubei returnee exposure," and (4) "others." "hubei exposure" comprised confirmed cases of individuals who had recently left hubei province or had not recently left hubei but had been in contact with asymptomatic individuals who had been in hubei province. "confirmed case contact" refers to infected individuals who had not left their residential areas and they had been in close contact with individuals who were confirmed cases. "non-hubei returnee exposure" refers to individuals who had recently returned to xinyang or fuyang from non-hubei provinces. in this study, a "returnee" was an individual who had returned to xinyang or fuyang from some other location, and "non-returnee" referred to an individual who had not left xinyang or fuyang. the age, gender, trajectory, and rates of infection distributions of the 416 confirmed cases in xinyang and fuyang were analyzed. 
the distribution of confirmed cases in households was analyzed to describe the extent of covid-19 clustered within household units. using the data on timing of disease onset and final diagnosis, the covid-19 development over time was investigated. regarding the disease trajectory (confirmed cases), four temporal stages were identified: (1) arrival → disease onset, (2) disease onset → first medical visit, (3) first medical visit → final diagnosis, and (4) final diagnosis → recovery. the mean periods of each stage were described and analyzed. ibm spss 22.0 was used for data analysis. as of february 22, 2020, 429 cases had been confirmed in xinyang and fuyang. in xinyang, 184 of the 274 cases were in recovery (67.15%), and two had died (0.73% fatality rate); in fuyang, 99 of the 155 cases were in recovery (63.87%), and no deaths were reported. thus, on that date, there were 88 and 56 ongoing cases in xinyang and fuyang, respectively. however, 13 cases were not included due to incomplete information. among the confirmed cases with complete effective data (n=416), 236 were male (56.73%) and 180 were female (43.27%). the proportional age distribution was zero-18 years old (5.05%), 19-59 years old (78.84%), and 60 years or older (16.11%). the proportion of confirmed cases aged 19-59 years in the returnee group was higher than among the non-returnees, and the proportions of confirmed cases aged zero-18 and aged 60 or older among the non-returnees were higher than among the returnees. table 1 presents the distributions regarding age, gender, source of infection, and within-household transmission and figure 1 illustrates the disease trajectory between onset and final diagnosis from january 11, 2020, through february 22, 2020, and the disease trajectory from january 23, 2020, through february 22, 2020, by source of infection. 
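the case counts reported above can be cross-checked with a few lines of arithmetic. the following is a minimal sketch that uses only the numbers stated in the text; the variable and function names are illustrative and are not part of the original analysis, which was run in ibm spss.

```python
# Counts reported in the text as of February 22, 2020.
cases = {"xinyang": 274, "fuyang": 155}
recovered = {"xinyang": 184, "fuyang": 99}
deaths = {"xinyang": 2, "fuyang": 0}


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# Ongoing cases are confirmed cases that have neither recovered nor died.
ongoing = {city: cases[city] - recovered[city] - deaths[city] for city in cases}
recovery_pct = {city: pct(recovered[city], cases[city]) for city in cases}
fatality_pct = {city: pct(deaths[city], cases[city]) for city in cases}

total_confirmed = sum(cases.values())
effective = total_confirmed - 13  # 13 cases were excluded for incomplete information
male_pct = pct(236, effective)
female_pct = pct(180, effective)
```

the derived values agree with the percentages and ongoing-case counts given in the text, which suggests the reported figures are internally consistent.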
figure 1a illustrates that the first day of disease onset in the two cities was january 11, 2020, after which the number of confirmed cases gradually increased. the disease onset peak was january 25 through january 30 and then the number of newly confirmed cases gradually decreased. the first final diagnosis was on january 23, 2020, the numbers of final diagnoses gradually increased, they peaked january 31 through february 5, and they gradually decreased from that date. the peak of the final diagnoses was about six days after the peak of disease onset. hubei exposure was the source of 213 (51.20%) confirmed cases, 108 (25.96%) cases were confirmed case contacts, non-hubei returnee exposure accounted for 32 (7.69%) cases, and there were 63 (15.14%) cases due to other sources. figure 1b shows that the main source of infection before february 7 was hubei exposure, and, after february 7, the main source of infection was confirmed case contact. regarding within-household infection, 51 households (with 130 confirmed cases) experienced within-household transmission based on multiple infected household members (table 1) . of them, 33 households had two, 13 households had three, two households had four, one household had five, and two households had six infected household members (64.71%, 25.49%, 3.92%, 1.96%, and 3.92% of the households with more than one infected household member, respectively). the mean number of people infected in the households with more than one infected member was 2.55. returnees. the period between date of first visit and date of final diagnosis was slightly longer for non-returnees than returnees, and the period between date of final diagnosis and date of recovery was slightly longer for returnees than non-returnees. numbers less than zero mean that some confirmed cases did not present symptoms at the time of first medical visit. 
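the source-of-infection shares and the household cluster statistics reported above follow directly from the stated counts. the sketch below recomputes them; the dictionary names are illustrative only.

```python
# Source of infection among the 416 cases with complete data (counts from the text).
sources = {
    "hubei exposure": 213,
    "confirmed case contact": 108,
    "non-hubei returnee exposure": 32,
    "others": 63,
}
total_cases = sum(sources.values())
source_shares = {name: round(100 * n / total_cases, 2) for name, n in sources.items()}

# Households with more than one infected member: {infected members: number of households}.
households = {2: 33, 3: 13, 4: 2, 5: 1, 6: 2}
n_households = sum(households.values())
n_cases = sum(size * count for size, count in households.items())
mean_size = n_cases / n_households
household_shares = {
    size: round(100 * count / n_households, 2) for size, count in households.items()
}
```

the totals reproduce the 51 households, 130 clustered cases, and mean of about 2.55 infected members per affected household stated in the text.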
among the returnees, the proportion of confirmed cases aged 19-59 years was 87.37%, whereas the proportion of those aged 60 years or older and 0-18 years was just 12.64%. compared with the non-returnees, the proportion of returnees aged 19-59 years was 16.28 percentage points higher and the proportion of those aged 60 or older was 13.38 percentage points lower. the returnees' male to female sex ratio was 1.64:1, and the male to female sex ratio among non-returnees was 1.08:1. these results might reflect the fact that migrant workers are most likely to be working-age males, which means that there were relatively higher proportional representations of females and older people among the non-returnees. moreover, this finding indicates that people of all ages are susceptible to covid-19. the numbers of final diagnoses of confirmed cases peaked within 14 days of onset and then gradually decreased until february 22, when just one case was diagnosed. this finding suggests that the spread of the virus had effectively been controlled through various measures, such as isolating exposed people, reducing public gatherings, increasing screenings for fever, and widespread public dissemination of prevention and control information. this study's analysis revealed that the main source of confirmed cases was hubei exposure or confirmed case contact. during the first half of the outbreak period, hubei exposure was the likeliest source and, during the second half of the outbreak period, confirmed case contact was the likeliest source of infection. previous studies have found that close contact with infected individuals tended to carry a high risk of infection. 18, 19 the present study found that, of the confirmed cases whose source of infection was "confirmed case contact," 72.22% were via household contact with one or more confirmed cases. 25 studies have shown that the number of infected cases was significantly reduced by controlling the city's traffic, closing entertainment venues, and banning public gatherings. 
implementing these measures can limit the progression of the epidemic. 26 by further controlling within-household contact with infected people and the size of public gatherings, incidence might be further decreased. the key to controlling infectious diseases is early detection, reporting, isolation, and treatment. we found that the mean period from date of return to the study area to date of disease onset was 6.69 days (range: -7 to 33; values less than zero indicate that some confirmed cases had symptoms reflecting disease onset before they returned home). twenty confirmed cases among the returnees (10.1%) had symptoms before they arrived in xinyang or fuyang, suggesting that one of the first steps should be to assertively control workers' return home, which, in the early stage, might slow the rate of infection. the mean period from date of disease onset to first medical visit was about 4.44 days (range: -3 to 21; values less than zero indicate that some cases did not present symptoms at the time of first medical treatment). two cases did not have symptoms at the time of first medical treatment (screening). li et al. found that the mean interval between date of disease onset and date of first visit was 5.8 days (cases with onset before january 1, 2020) or 4.6 days (onset from january 1 through january 11). 27 we found a slightly shorter period, implying that public awareness of covid-19 and medical treatment had gradually improved and people were increasingly likely to seek treatment. moreover, the mean period between date of disease onset and date of first visit among non-returnees was slightly longer than among returnees, indicating that quarantine and isolation measures were slightly stronger for returnees than non-returnees. we found that the mean period between date of first visit and date of final diagnosis was 3.18 days, suggesting that the efficiency of early detection measures needed improvement. 
in addition, the mean period between date of final diagnosis and date of recovery was about 13.42 days (range: 5-25). the mean hospital stay was 10 days in a previous study, 28 but it was slightly longer in our study. effective responses to covid-19 for prevention and control required implementation of governmental measures, which apparently controlled the rate of infection in xinyang and fuyang, which are cities with significant flows of migrant workers to and from hubei province. the key to controlling the rate of infection via returnees is to act as quickly as possible, focus on isolating and controlling returnees' mobility, and decreasing close within-household contact between infected and non-infected household members. if these measures were implemented as a preemptive first step, the rate of infection would further be reduced. the funder had no role in the study's design, data collection, analysis, the decision to publish epidemiology working group for ncip epidemic response. an update on the epidemiological characteristics of novel coronavirus pneumonia (covid-19) clinical features of patients infected with 2019 novel coronavirus in wuhan a familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster first case of 2019 novel coronavirus in the united states national health commission of the people's republic of china. what is fecal-oral transmission? national health commission of the people's republic of china the state council information office of the people's republic of china. press conference of the joint prevention and control of the state council a 30-hour old infant in wuhan diagnosed and mother-to-child infection suspected clinical analysis of 10 neonates born to mothers with 2019-ncov pneumonia world health organization epidemiology working group for ncip epidemic response. 
the epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (covid-19) in china the investigation and research on fuyang's off farm workers national health commission, ministry of human resources and social security, ministry of finance. measures to improve working conditions of and care for physical and mental health of healthcare workers clinical characteristics of 138 hospitalized patients with novel coronavirus-infected pneumonia in wuhan, china henan provincial people's government a new coronavirus associated with human respiratory disease in china the novel coronavirus originating in wuhan, china: challenges for global health governance reduce large-scale gathering activities in wuhan an investigation of transmission control measures during the first 50 days of the covid-19 epidemic in china early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia. the new england journal of medicine 2020 the state council information office of the people's republic of china. press conference of the joint prevention and control of the state council we would like to thank editage (www.editage.cn) for english language editing. the authors declare that they have no competing interests. key: cord-333265-na7f0yam authors: zeng, yiping; guo, xiaojing; deng, qing; zhang, hui title: forecasting of covid-19 spread with dynamic transmission rate date: 2020-08-21 journal: nan doi: 10.1016/j.jnlssr.2020.07.003 sha: doc_id: 333265 cord_uid: na7f0yam abstract the covid-19 was firstly reported in wuhan, hubei province, and it was brought to all over china by people travelling for chinese new year. the pandemic coronavirus with its catastrophic effects is now a global concern. forecasting of covid-19 spread has attracted a great attention for public health emergency. however, few researchers look into the relationship between dynamic transmission rate and preventable measures by authorities. 
in this paper, the seir (susceptible exposed infectious recovered) model is employed to investigate the spread of covid-19. the epidemic spread is divided into two stages: before and after intervention. before intervention, the transmission rate is assumed to be constant, since individual, community and government responses have not yet taken place. after intervention, the transmission rate is reduced dramatically due to the societal actions or measures to reduce and prevent the spread of disease. the transmission rate is assumed to follow an exponential function, and the removal rate is assumed to follow a power exponent function. the removal rate increases over time. using the real data, the model and parameters are optimized. the transmission rate without measures is calculated to be 0.033 and 0.030 for hubei and outside hubei province, respectively. after the model is established, the spread of covid-19 in hubei province, france and usa is predicted. from the results, usa performs the worst according to the dynamic ratio. the model has provided a mathematical method to evaluate the effectiveness of the government response and can be used to forecast the spread of covid-19 with better performance. on december 8, 2019, a case of unexplained pneumonia was reported in wuhan. the virus was brought to the rest of hubei province and china by people travelling for chinese new year. after the outbreak of the new pneumonia, the coronavirus disease was named covid-19 by the world health organization on february 11, 2020 [1] . in order to control and stop the spread of covid-19, the chinese national health commission took strict measures to lock down wuhan city, and all transportation was suspended to prevent human-to-human contact on january 23, 2020 [2] . 
epidemiological modeling plays a vital role in the early warning and prevention of outbreaks, such as severe acute respiratory syndrome (sars-cov) [3] , middle east respiratory syndrome (mers-cov) [4] , ebola virus [5, 6] , zika virus [7, 8] and so on. until now, much research has been performed on covid-19 [9] [10] [11] . yang et al. [10] calculated the basic reproduction number and the death rate by analyzing the data from infected people, and they found that the basic reproduction number was about 3.77 and the mean incubation period was estimated to be 4.75 days. li et al. [12] analyzed the data for the first 425 confirmed cases in wuhan for the purpose of investigating the epidemiologic characteristics of covid-19. they found that the mean incubation period was 5.2 days (95% confidence interval, 4.1 to 7.0), with the 95th percentile of the distribution at 12.5 days. zhong et al. [13] collected data from 1099 patients confirmed with covid-19 in china. through data analysis, they found that the median incubation period was 4 days (interquartile range, 2 to 7). all of these studies help us better understand covid-19 and find suitable methods to prevent infection and cure individuals. the researchers mentioned above focused on the clinical characteristics of covid-19, while others paid attention to modelling and prediction, which could also provide a reference for epidemic management. a majority of researchers [1, 14] modelled and reproduced the spread process of the virus using the original or modified sir and seir models. natsuko et al. [15] estimated the potential number of novel coronavirus cases in wuhan. from their results, there were a total of 1723 cases of covid-19 with onset of symptoms by january 12, 2020. based on the susceptible-exposed-infected-removed (seir) compartment model, zhou et al. [14] found that the basic reproduction number ranged from 2.8 to 3.3 with the help of the dataset reported by the people's daily in china. 
the predicted value fell between 3.2 and 3.9. xiong et al. [16] analyzed the infected population and spread trend of covid-19 under different policy with the help of seir model, and they found that the epidemic spreading was dominated by the quarantine rate and starting date of intervention. ming et al. [17] explored the effect of covid-19 on healthcare system using mathematical modelling, and found that if there was no effective intervention, the healthcare system burden would be increased with the increased confirmed cases. however, few researchers have taken the dynamic transmission rate into consideration because of varied preventable measures by authorities. in reality, after actions taken by authorities, cities are in lockdown and individuals need to stay at home, resulting in the decrease of the transmission rate. in this paper, the spread of covid-19 is divided into two stages: before and after intervention. before intervention, the transmission rate is assumed to be a constant since individual, community and government response has not taken into place. after intervention, the transmission rate is reduced dramatically due to the societal actions or measures to reduce and prevent the spread of disease. the transmission rate is assumed to follow the exponential function. in this paper, the original and modified seir models are briefly introduced in terms of the transmission rate and removal rate in section 2. in section 3, based on the least square method, the improved model is optimized by considering accumulated number of infected individuals and daily new cases. then we compare and discuss the performance between the original and modified models. using the modified model, the spread of covid-19 in hubei province, france and usa is predicted and compared. conclusions are made in the last section. the original seir (susceptible exposed infectious recovered) model is widely used to predict the spread of epidemic disease. 
s denotes the susceptible individuals, who have a probability β of entering the exposed class e after close contact with individuals carrying the virus. after some days without any obvious features (the incubation period), exposed individuals e have a probability α of showing characteristics of the epidemic disease, that is, of becoming infected individuals i. infected individuals i either recover or die, and in both cases they are removed from the system. the removal rate is called γ. the spread process of the epidemic disease is described by a system of differential equations over s, e, i and the removed class, in which r is the average number of contacts per person per day. in the case of covid-19, individuals s enter the exposed class e because of contact with both exposed individuals e and infected individuals i. however, the original model does not consider the fact that exposed individuals are able to infect susceptible individuals, which is one of the most important factors for epidemic disease spread. in the modified model, susceptible individuals are infected by exposed individuals e with a probability of β 2 and by infected individuals i with a probability of β 1 , where r 1 is the average number of individuals with whom an infected individual comes into close contact and r 2 is the corresponding number for exposed individuals. in order to prevent the spread of the virus, authorities took various measures, such as locking down cities, controlling traffic, requiring facemasks, educating the public about the disease, and requiring potential patients to stay in hospital or at home. the implemented measures can control and stop the spread of the virus and save lives. in other words, after the measures are taken, the transmission rates β 1 and β 2 begin to decrease. in our model, we assume that the transmission rate follows an exponential function of time after intervention, where τ is the time when the measures are taken to intervene in the spread of the virus and k is a constant value that controls the transmission rate. 
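the modified model described above can be sketched as a small simulation. this is an illustration under stated assumptions rather than the authors' implementation: it assumes the standard mass-action seir form with total population n, a transmission rate that is constant before day τ and decays as β 0 ·exp(-k(t-τ)) afterwards, a constant removal rate, and simple forward-euler integration. the function names, the population size, and the values of k, r 1 , r 2 , α and γ are hypothetical and chosen only for demonstration.

```python
import math


def beta_t(t, beta0, tau, k):
    """Transmission rate: constant before day tau, exponential decay after (assumed form)."""
    return beta0 if t < tau else beta0 * math.exp(-k * (t - tau))


def simulate(days, n, e0, i0, beta0, tau, k, r1, r2, alpha, gamma, dt=0.1):
    """Forward-Euler integration of the modified SEIR model.

    Both infected (I) and exposed (E) individuals can infect susceptibles,
    with beta1 = beta2 = beta_t(t). Returns one (S, E, I, R) tuple per day.
    """
    s, e, i, r = float(n - e0 - i0), float(e0), float(i0), 0.0
    steps_per_day = round(1 / dt)
    daily = [(s, e, i, r)]
    for step in range(days * steps_per_day):
        b = beta_t(step * dt, beta0, tau, k)
        new_exposed = b * (r1 * i + r2 * e) * s / n  # S -> E
        new_infected = alpha * e                     # E -> I
        new_removed = gamma * i                      # I -> R
        s -= dt * new_exposed
        e += dt * (new_exposed - new_infected)
        i += dt * (new_infected - new_removed)
        r += dt * new_removed
        if (step + 1) % steps_per_day == 0:
            daily.append((s, e, i, r))
    return daily


# Illustrative values only: beta0 = 0.033 is taken from the text, the rest are hypothetical.
trajectory = simulate(days=60, n=59_000_000, e0=410, i0=41, beta0=0.033,
                      tau=12.0, k=0.1, r1=5.0, r2=20.0, alpha=0.2, gamma=0.1)
```

setting r2 to zero recovers the original model, in which only infected individuals transmit the virus, so the same function can be used to compare the two variants.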
the removal rate in the original model is kept constant. however, in reality, as time progresses and medicines and therapies for curing patients are found, the death rate is gradually reduced. the removal rate will therefore increase with the passage of time. at the same time, from the data collected, we find that the recovery rate is larger than the death rate. therefore, the removal rate γ(t) is modelled as a power exponent function of time, in which a and b are constant values. the basic reproductive number r 0 measures the average number of secondary people infected by a primary patient in a pool of mostly susceptible individuals in the absence of control measures [18] and it is a parameter used to estimate the epidemic spread in a closed group [19] . for any initial level of epidemic disease, the disease is going to disappear from the population in the infected area when r 0 is smaller than 1. when r 0 is larger than 1, the disease is spreading in the population [20] . there are many ways [21, 22] to calculate r 0 in terms of formula derivation and model fitting. in our model, the basic reproduction number r 0 is estimated by the formula r 0 = β/γ for the purpose of simplifying the model. furthermore, because of measures taken by authorities, the basic reproduction number r 0 may vary with the passage of time. by considering the transmission rate β(t) and removal rate γ(t), the effective reproduction number is estimated as r e (t) = β(t)/γ(t). in order to simplify the model, some hypotheses are made, as shown in fig. 
1 : 1) the exposed individuals and infected individuals have same probability to infect susceptible individuals, that is β 1 =β 2 ; 2) there is no pedestrian flow between hubei and outside hubei, and covid-19 spreads in the corresponding area; 3) removed individual from the system has no ability to infect others; 4) the transmission rate β is assumed to follow an exponential function considering the fact that fewer individuals are infected after measures are in placed; 5) the removal rate γ is supposed to follow a power exponent function, and the removal rate increases as the time processes due to the better treatment. 6) the basic reproduction number r 0 changes with the time because of measures taken by authorities. the effective reproduction number is calculated by r e (t)= β(t)/γ(t). the national health commission of the people's republic of china published the accumulative number of the confirmed cases and daily new cases on the official website. by the use of r package [23] , we acquired the datasets of hubei province in respect of accumulative confirmed cases and daily new cases from january 11 to march 19, 2020. for out of hubei province, the first case was reported on january 20. the data of outside hubei province in china was obtained from january 20 to march 19, 2020. the interest is shifted to the possible range of parameters in the model. first of all, the number of infected patients is needed to be confirmed. according to datasets from the national health commission of china and reports by the health commission of hubei province, the number of infected people was 41 in hubei province and the first day was set up as january 11. for the data out of hubei province, january 20 is set up as the first day. furthermore, the number of infected individuals on the first day is 21. for the initial exposed individuals, it is difficult to estimate the number of exposed individuals due to medical techniques. 
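hypotheses 4) to 6) above can be illustrated with a short calculation of r e (t) = β(t)/γ(t). the sketch below is hedged: the text states only that β follows an exponential function and γ a power exponent function, so the specific forms β 0 ·exp(-k(t-τ)) and a·t^b, as well as the values of k, a and b, are assumptions made for illustration and are not the fitted values from table 2.

```python
import math


def beta_t(t, beta0=0.033, tau=12.0, k=0.1):
    """Assumed transmission rate: constant before tau, exponential decay afterwards."""
    return beta0 if t < tau else beta0 * math.exp(-k * (t - tau))


def gamma_t(t, a=0.005, b=0.5):
    """Assumed removal rate: power function a * t**b, increasing in t (defined for t >= 1)."""
    return a * t ** b


def r_e(t):
    """Effective reproduction number r_e(t) = beta(t) / gamma(t)."""
    return beta_t(t) / gamma_t(t)


# First day (counting from day 1) on which the effective reproduction number drops below 1.
first_day = next(t for t in range(1, 200) if r_e(t) < 1.0)
```

with these assumed forms, r e falls slowly before τ because only γ(t) is changing, and falls faster afterwards once β(t) also starts to decay, which is the qualitative behaviour the hypotheses describe.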
some researchers thought that the ratio (infected: exposed) is about 13%. based on experience, the ratio is set to be 10%. the initial exposed number is therefore set to be 410 for hubei province and 210 for outside hubei province. no individual recovers from the disease on the first day. all initial removed individuals are therefore set to be zero. there is no way or method to define the average number of individuals whom an infected individual meets, so r 1 is set to be a value ranging from 0 to 10 according to the value used for sars. because the incubation period gives exposed individuals more opportunity to meet others through close contact, r 2 is set to be within [0, 35] . lastly, the transmission rate from the exposed individuals to infected individuals α is set from 0 to 0.6. in addition, with the help of the least square method, other unknown parameters are optimized to fit the real data, as shown in table 2 . because there is a big difference between hubei and outside hubei provinces in respect of control measures, we estimate the accumulated infected individuals and effective reproduction number r e separately for hubei and outside hubei provinces. based on the dataset, the model is fitted by simulation. fig. 2 and fig. 3 show the fitted and predicted data of hubei and outside hubei provinces, respectively. from fig. 2 , the spread of the virus is controlled at the start of march, as there are fewer new cases in hubei. it is also found from fig. 3 that on january 20 there are fewer infected individuals. furthermore, the fitted parameters are obtained and shown in table 2 . firstly, the transmission rate without measures β 0 is equal to 0.033 and 0.030 for hubei and outside hubei provinces, respectively. thus the virus in hubei has a larger probability of infecting other individuals. the same holds for the transmission rate from the exposed individuals to infected individuals α. fig. 
4 shows the effective reproduction number r e over time in hubei and outside hubei provinces, respectively. from fig. 4 , r e decreases slightly before january 23 due to improving treatment. then r e declines suddenly because authorities implement several control measures and individuals are alerted to prevent infection. measures and treatments result in a decreasing effective reproduction number with the passage of time. fig. 4 shows that the virus is about to fade away in may in hubei province. also from fig. 4 , the virus outside hubei province is going to die out earlier due to secondary or third generation of virus. the results from the original and modified models are compared in terms of r e . from fig. 4 , r e from the original model is larger than that from the modified model. in the original model r e decreases only slightly as time progresses, so it does not perform well given the preventive measures taken by authorities. the daily new cases are compared in hubei and outside hubei provinces, respectively. from fig. 5 and fig. 6 , it is found that daily new cases by the original model are smaller than those by the modified model in the early period of virus spread. underestimating the virus leads to less attention being paid to it, which can result in large damage and casualties for any country or region. fig. 7 . it means that the virus is under control. in fig. 7 , the label (actual infected population) means the number of cases confirmed with covid-19 in reality. it is found that, in real life, usa performs the worst. from fig. 7 , the spread of covid-19 in hubei province is brought under control first. in this paper, a modified model is developed to better predict the spread of covid-19 considering the dynamic change of the control measures and treatment. in our model, the transmission rate β is assumed to follow an exponential function by considering the fact that fewer individuals are infected after measures to prevent the virus spread. 
then the removal rate γ is supposed to follow a power exponent function. the removal rate is increased with time because of better cure for disease. based on real data, we optimize the model parameters using the least square method. transmission rate without measures β 0 is equal to 0.033 and 0.030 for hubei and outside hubei. the results from the original and modified models are compared in terms of effective reproduction number r e and daily new cases. it is found that daily new cases obtained by the original model are smaller than those by the modified model at the starting spread of epidemiological virus. fewer infected individuals contribute to less attention to the virus, which may result in large damage and casualty. furthermore, the model is used to evaluate the coronavirus spread of hubei province, france and usa. usa performs the worst according to the ratios. the model has provided a mathematical method to evaluate the effectiveness of the government response and can be used to forecast the spread of covid-19 with better performance. 
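the least square fitting step mentioned above can be sketched as a bounded search that minimizes the squared error between model output and observed counts. this is a toy illustration only: the growth model is a hypothetical stand-in for the full epidemic model, the observed series is synthetic rather than real case data, and a brute-force grid search is used in place of whatever optimizer the authors applied.

```python
def toy_cumulative(rate, days, i0=41.0):
    """Hypothetical stand-in model: cumulative cases growing geometrically at `rate` per day."""
    total, series = i0, []
    for _ in range(days):
        series.append(total)
        total *= 1.0 + rate
    return series


def sum_squared_error(predicted, observed):
    """Least-squares objective: sum of squared residuals."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def grid_search(observed, lower, upper, steps):
    """Return the rate in [lower, upper] that minimizes the squared error."""
    candidates = [lower + (upper - lower) * j / steps for j in range(steps + 1)]
    return min(
        candidates,
        key=lambda rate: sum_squared_error(toy_cumulative(rate, len(observed)), observed),
    )


observed = toy_cumulative(0.3, 20)  # synthetic data generated with a known rate
best_rate = grid_search(observed, 0.0, 0.6, 600)
```

the search recovers the rate used to generate the synthetic series; for a real fit, the objective would combine residuals on both the accumulated number of infected individuals and the daily new cases, as described in the text.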
nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study a discrete epidemic model for sars transmission and control in china synthesizing data and models for the spread of mers-cov, 2013: key role of index cases and hospital transmission ebola virus disease in west africa-the first 9 months of the epidemic and forward projections ebola control: effect of asymptomatic infection and acquired immunity estimate of the reproduction number of the 2015 zika virus outbreak in barranquilla, colombia, and estimation of the relative role of sexual transmission zika virus epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in wuhan, china: a descriptive study epidemiological and clinical features of the 2019 novel coronavirus outbreak in china, medrxiv epidemiological and clinical features of primary herpes simplex virus ocular infection early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia clinical characteristics of coronavirus disease 2019 in china preliminary prediction of the basic reproduction number of the wuhan novel coronavirus 2019-ncov estimating the potential total number of novel coronavirus cases in wuhan city simulating the infected population and spread trend of 2019-ncov under different policy by eir model breaking down of healthcare system: mathematical modelling for controlling the novel coronavirus (2019-ncov) outbreak in mathematical models in population biology and epidemiology the basic reproductive number of ebola and the effects of public health measures: the cases of congo and uganda global properties of sir and seir epidemic models with multiple parallel infectious stages estimating the basic reproductive ratio for the ebola outbreak in liberia and sierra leone an r package and a website with real-time data on the covid-19 coronavirus outbreak, medrxiv the authors declared that they have 
no conflicts of interest to this work. key: cord-285965-mar8zt2t authors: su, liang; ma, xiang; yu, huafeng; zhang, zhaohua; bian, pengfei; han, yuling; sun, jing; liu, yanqin; yang, chun; geng, jin; zhang, zhongfa; gai, zhongtao title: the different clinical characteristics of corona virus disease cases between children and their families in china – the character of children with covid-19 date: 2020-03-25 journal: emerg microbes infect doi: 10.1080/22221751.2020.1744483 sha: doc_id: 285965 cord_uid: mar8zt2t this study aims to analyze the different clinical characteristics between children and their families infected with severe acute respiratory syndrome coronavirus 2. clinical data from nine children and 14 of their family members were collected, including general status, clinical, laboratory test, and imaging characteristics. all the children tested positive after the onset of illness in their family members. three children had fever (22.2%) or cough (11.1%) symptoms and six (66.7%) children had no symptoms. among the 14 adult patients, the major symptoms included fever (57.1%), cough (35.7%), chest tightness/pain (21.4%), fatigue (21.4%) and sore throat (7.1%). all of the adult patients had normal (71.4%) or decreased (28.6%) white blood cell counts, and 50% (7/14) had lymphocytopenia. ten adults (71.4%) showed abnormal imaging. the main manifestations were pulmonary consolidation (70%), nodular shadow (50%), and ground glass opacity (50%). five discharged children were admitted again because their stool tested positive for sars-cov-2 by pcr. covid-19 in children is mainly caused by family transmission, and their symptoms are milder and their prognosis is better than in adults. however, their stool remained pcr-positive for longer than that of their family members. because of the mild or asymptomatic clinical course, early recognition is difficult for pediatricians and public health staff. 
in late 2019, an outbreak of pneumonia with unknown etiology was found in wuhan, hubei province, china. then the pathogen was isolated soon and named the 2019 novel coronavirus (2019-ncov) on 12 january 2020 [1] and on 11 february, the international committee on taxonomy of viruses announced that its official classification is severe acute respiratory syndrome coronavirus 2 (sars-cov-2). the virus spread very fast in wuhan. even more unfortunate, as the chinese spring festival is approaching, aggregation of large numbers of people flow caused it to spread quickly across the country and even spread to more than 100 countries [2] . the current case reports are mainly concentrated in hubei province and adults, but cases of children outside hubei province are rare. meanwhile, the clinical characteristics of cases in hubei province and other provinces were significantly different. here, we report the clinical manifestations, laboratory test results, imaging characteristics, and treatment regimen of nine sars-cov-2 infected children and their families in jinan, shandong province to increase awareness of this disease, especially in children. a retrospective review was conducted of the clinical, lab tests, and radiologic findings for nine children and their families admitted to the jinan infectious diseases hospital identified to be nucleic acid-positive for sars-cov-2 from 24 january 2020 to 24 february 2020. sample collection and pathogen identification after admission to the hospital, respiratory tract samples including sputum and nasopharyngeal swabs were collected from the patients, which were tested for influenza, avian influenza, respiratory syncytial virus, adenovirus, parainfluenza virus, mycoplasma pneumoniae and chlamydia, along with routine bacterial, fungal, and pathogenic microorganism tests. real-time pcr used the sars-cov-2 (orf1ab/n) nucleic acid detection kit (bio-germ, shanghai, china) and performed refer to previous literature [3] . 
basic information and epidemiological histories [4] were recorded for all patients, including (1) history of travel to or residence in wuhan and surrounding areas, or other areas with reported cases, within 14 days before onset; (2) history of contact with a person with new coronavirus infection (nucleic acid-positive) within 14 days before onset; (3) history of contact with patients with fever or respiratory symptoms from wuhan and surrounding areas, or from communities with case reports, within 14 days before onset; (4) cluster onset, along with disease condition changes. laboratory test results were compiled, including standard blood counts, blood biochemistry, c-reactive protein (crp), procalcitonin (pct), erythrocyte sedimentation rate (esr), interleukin-6 (il-6) and myocardial enzyme spectrum. additional data collected included medical imaging, treatment regimens, prognosis (any severe complications, including death), and recovery or discharge date (table 1). this study was conducted in accordance with the declaration of helsinki. informed consent was waived because of the retrospective nature of the study and because the analysis used anonymous clinical data. continuous data are expressed as medians and ranges, and categorical data are presented as counts and percentages. three boys and six girls, together with 14 of their family members, admitted to jinan infectious disease hospital of shandong university were investigated in this study. the youngest of the nine children were a pair of eleven-month-old twins and the oldest was nine years and nine months old (mean age 4.5 years, median age 3.5 years, table 1). sixteen family members were infected by sars-cov-2, and 14 adults were enrolled in this study (two patients were hospitalized in another hospital). the 14 adult patients consisted of 8 males and 6 females with a mean age of 42.9 years (median age, 37 years [range, 30-72 years]). all nine pediatric patients came from eight families.
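the descriptive summary described above (medians and ranges for continuous data, counts and percentages for categorical data) can be sketched in code. the following is a minimal python illustration, not the authors' analysis script: the helper names `count_pct` and `median_range` are hypothetical, the counts 10/14, 4/14 and 11/14 are proportions reported in this study, and the list of ages is invented purely to show the median-and-range format.

```python
from statistics import median


def count_pct(count, total):
    """Format a categorical result as 'count/total (percent%)' with one decimal."""
    return f"{count}/{total} ({100 * count / total:.1f}%)"


def median_range(values):
    """Return (median, minimum, maximum) for a continuous variable."""
    return median(values), min(values), max(values)


# Proportions reported in the study (adults, n = 14).
print(count_pct(10, 14))  # normal white blood cell count
print(count_pct(4, 14))   # decreased white blood cell count
print(count_pct(11, 14))  # normal ferritin

# Hypothetical ages in years, invented for illustration only.
example_ages = [30, 34, 37, 45, 72]
print(median_range(example_ages))
```

recomputing percentages from counts in this way is also a quick check that the reported figures are internally consistent.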
as shown in table 1, six children had no relevant information in the epidemiological data; 7/14 (50%) of the adults were infected through household contact, 5 (35.7%) were found to be infected after returning from wuhan or elsewhere in hubei in late january 2020, and for 2 (14.3%) patients the exact source of infection could not be found. as shown in table 2, 8/9 (88.9%) children had normal or decreased white blood cell counts, consistent with the main characteristic of viral infection. six children (66.7%) showed increased ck-mb. alt, ast and the other indices of liver and kidney function were all normal. all inflammation indicators, including crp, pct, esr and il-6, were within the normal range. two children (22.2%) showed bronchitis and one (11.1%) showed bronchial pneumonia. one (11.1%) boy (the older of the twins) showed pulmonary consolidation and ground glass opacity on the first day of admission (figure 1(a)), which disappeared after five days (figure 1(b)). the other five (55.6%) children showed no abnormality on chest radiograph. all the adult patients had normal (10/14, 71.4%) or decreased (4/14, 28.6%) white blood cell counts and 10 (71.4%) had lymphopenia. four (28.6%) patients had increased crp, pct, serum amyloid a (saa), d-dimer and il-6, and their ct scans showed larger areas of lung consolidation. in contrast to the children, only two (14.3%) adult patients showed increased ck-mb. ferritin levels were higher in the adult patients than in the children, but most were normal (11/14, 78.6%). chest imaging findings in adults were mixed; the most common features were pulmonary consolidation (50%), nodular shadow (42.9%), and ground glass opacity (ggo, 35.7%) (figure 2). four (28.6%) adults showed normal chest imaging. at present, there are no drugs available that can target sars-cov-2. therefore, treatment focused on symptomatic and respiratory support. all the children inhaled interferon and one of the twins was additionally prescribed ribavirin (10-15 mg/kg/day).
ten (71.4%) adults with pneumonia were treated with lopinavir/ritonavir (200/50 mg, 2 tablets, bid), interferon and chinese medicine. patients with elevated infection indices (such as crp, pct, esr, saa, il-6) were additionally prescribed antibiotics for 5-7 days. all nine children and 14 adult patients recovered in 2-3 weeks and were discharged after two negative nucleic acid tests. unfortunately, follow-up showed that five discharged children were readmitted before we submitted this article because their stool tested positive for sars-cov-2 by pcr. meanwhile, all their family members were negative in all specimens. coronaviruses are a large family of viruses that are known to cause illness ranging from the common cold to more severe diseases. as an enveloped rna virus, cov is ubiquitous in humans, other mammals, and birds, and can cause respiratory, digestive, liver and nervous system disorders [5, 6]. to date, six covs have been known to cause human infection [7]. among them, two zoonotic viruses, sars-cov and mers-cov, were responsible for serious outbreaks, including in china in 2002-2003 [8, 9]. of particular concern, we observed that all the children were diagnosed after their family members, which indicates that they were infected through household contact. however, after an epidemiological investigation, we found that six adults (42.9%) had a definite or suspected contact history and six family members (42.9%) who had contact with them were infected, while the other two patients (14.3%) denied any epidemiological history. among them, the father of case 9 had not been in contact with anyone who had returned from wuhan or hubei, and also denied contact with any person with respiratory symptoms. at the same time, official investigations did not find that anyone diagnosed with sars-cov-2 infection had been on the vehicle he travelled on and could have spread the virus.
in addition, according to official information, more and more patients have no identifiable source of infection, and more and more cluster outbreaks involve people who reported no contact, no close communication, or who had not even left home. we therefore think that these phenomena may suggest that: (1) the virus spreads very readily and its transmission may not be limited to contact, droplet and airborne transmission; aerosol transmission may also exist, as with sars [11]. (2) the virus may be carried asymptomatically after infecting the human body yet still infect other people. in china, the sars outbreak of 2003 is still vividly remembered: the 2002-2003 sars outbreak infected 8422 individuals, leading to 916 deaths in eight affected areas [12]. during the sars outbreak, there were fewer pediatric patients and symptoms were significantly milder in children than in adults [13] [14] [15] [16]. similarly, official data to date suggest that children infected with sars-cov-2 are also relatively rare [17], and their overall symptoms are notably mild. the main reasons for this phenomenon may be: (1) children's range of activities is relatively small, and they are mainly infected by adult family members. moreover, as an rna virus, sars-cov-2 may be prone to replication errors, mutating and surviving without recognition by the immune system, which can also cause a decline in virulence. thus, children are infected with second-, third- or even fourth-generation virus and develop milder symptoms; (2) there may be differences in the immune responses of children compared with adults. one hypothesis is that the innate immune response, that is, the early response aimed broadly at groups of pathogens, tends to be more active in children. the innate immune system is the first line of defense against pathogens. cells in that system respond immediately to foreign invaders.
the adaptive immune system, by contrast, learns to recognize specific pathogens, but takes longer to join the battle. if the innate immune response is stronger in children exposed to sars-cov-2, they may fight off infection more readily than adults, suffering only mild symptoms. other coronaviruses, including sars and mers, also show this pattern [18]. (3) the number or function of ace2 receptors in children may be lower than in adults. recently, one study investigated the role of the ace2 receptor and found that sars-cov-2 uses the sars-coronavirus receptor ace2 and the cellular protease tmprss2 for entry into target cells [19]. the distribution of ace2 receptors differs between organs and populations. therefore, different receptor levels or functions in children and adults may lead to different severity of illness. (4) other reasons, such as children having fewer underlying diseases, smoking less, and having strong self-healing capabilities. ck-mb is an indicator of myocardial injury. in the present study, six children and two adults had high ck-mb, which indicates that sars-cov-2 can cause heart injury. it is reported that the main mechanisms of sars-cov-2-induced myocardial injury may be direct viral injury, the inflammatory storm and the distribution of ace2 receptors [20]. as human lifestyles change, more and more viruses are spreading across species. current research confirms that sars-cov-2 was transmitted from animals to humans. as with other viruses, the relationship between sars-covs and humans has the following possibilities: (1) the virus disappears for unknown reasons, as sars-cov did. (2) the virus coexists with humans and has seasonal onsets, like influenza viruses. the first is the best outcome of the current situation, but the second is quite likely.
if, as analyzed above, many people, especially children with mild or no clinical symptoms, carry the virus without developing the disease while the virus spreads very readily, this may lead to silent spread of the disease and major losses. therefore, the chinese government will face greater risks after schools reopen and work resumes. clinicians, especially pediatricians, need to be vigilant to prevent widespread transmission of the disease. children who have infected family members should be monitored or evaluated, and family clustering should be reported to ensure a timely diagnosis. in addition, just before submission, we found that five of six discharged children returned to the hospital because of positive pcr results in their stool, whereas their family members were all negative. one girl (case 3) did not return to the hospital but was isolated at home because she had mild mental symptoms after discharge. although a positive result cannot confirm whether live virus was present in the stool, they were readmitted to the hospital for clinical observation in the interest of public health. interestingly, their onset was later than that of their family members, but the period of positive pcr was longer than in adults. we should pay more attention to this phenomenon and study the possible mechanism. several important limitations of this study should be noted. first, the sample size was small. second, this retrospective study included only children who were hospitalized in one hospital. but as one of the rare reports on children outside hubei province, it helps improve the ability to recognize patients with mild illness. further studies with large multi-center samples are needed.
in conclusion, by analyzing 23 confirmed cases of covid-19 in jinan, shandong province, this study's findings indicate that new control measures should include rapid medical assessment and removal of the case from the home, as well as increased awareness of the importance of protective measures after symptom onset. public health measures such as home isolation should be aimed at minimizing such risk factors when addressing household transmission of serious infections spread through droplet transmission. geneva: world health organization who. coronavirus disease (covid-2019) situation reports clinical characteristics of novel coronavirus cases in tertiary hospitals in hubei province national health commission of people's republic of china coronavirus pathogenesis fatal swine acute diarrhoea syndrome caused by an hku2-related coronavirus of bat origin epidemiology, genetic recombination, and pathogenesis of coronaviruses infectious diseases. battling sars on the frontlines epidemiology and cause of severe acute respiratory syndrome (sars) in people's republic of china isolation of a novel coronavirus from a man with pneumonia in saudi arabia clinical features and short-term outcomes of 144 patients with sars in the greater toronto area summary of probable sars cases with onset of illness from 1 severe acute respiratory syndrome in children: experience in a regional hospital in hong kong clinical presentations and outcome of severe acute respiratory syndrome in children new and emerging infectious diseases the novel coronavirus pneumonia emergency response epidemiology team.
the epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (covid-19) in china the novel coronavirus 2019 (2019-ncov) uses the sarscoronavirus receptor ace2 and the cellular protease tmprss2 for entry into target cells heart injury signs are associated with higher and earlier mortality in coronavirus disease 2019 (covid-19) we thank all patients involved in the study. dr zhang and gai had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. no potential conflict of interest was reported by the author(s). this study was funded by the jinan science and technology bureau [grant number 201907032]. the funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication. xiang ma http://orcid.org/0000-0001-6139-4355 key: cord-313700-enivzp1f authors: lio, chon fu; cheong, hou hon; lei, chin ion; lo, iek long; yao, lan; lam, chong; leong, iek hou title: the common personal behavior and preventive measures among 42 uninfected travelers from the hubei province, china during covid-19 outbreak: a cross-sectional survey in macao sar, china date: 2020-06-19 journal: peerj doi: 10.7717/peerj.9428 sha: doc_id: 313700 cord_uid: enivzp1f background: the novel coronavirus diseases 2019 (covid-19) caused over 1.7 million confirmed cases and cumulative mortality up to over 110,000 deaths worldwide as of 14 april 2020. a total of 57 macao citizens were obligated to stay in hubei province, china, where the highest covid-19 prevalence was noted in the country and a “lockdown” policy was implemented for outbreak control for more than one month. 
they were escorted from wuhan city to macao via a chartered airplane organized by macao sar government and received quarantine for 14 days with none of the individual being diagnosed with covid-19 by serial rna tests from the nasopharyngeal specimens and sera antibodies. it was crucial to identify common characteristics among these 57 uninfected individuals. methods: a questionnaire survey was conducted to extract information such as behavior, change of habits and preventive measures. results: a total of 42 effective questionnaires were analyzed after exclusion of 14 infants and children with age under fifteen as ineligible for the survey and missing of one questionnaire, with a response rate of 97.7% (42 out of 43). the proportion of female composed more than 70% of this group of returners. the main reason for visiting hubei in 88.1% of respondents was to visit relatives. over 88% of respondents did not participate in high-risk activities due to mobility restriction. all (100%) denied contact with suspected or confirmed covid-19 cases. comparison of personal hygiene habits before and during disease outbreak showed a significant increase in practice including wearing a mask when outdoor (16.7% and 95.2%, p < 0.001) and often wash hands with soap or liquid soap (85.7% and 100%, p = 0.031). the novel coronavirus diseases 2019 caused by the severe acute respiratory syndrome coronavirus 2 (sars-cov-2) caused over 1.7 million confirmed cases and cumulative mortality up to over 110,000 deaths worldwide as of 14 april 2020 (world health organization (who), 2020b), provided with its early transmission dynamic of human-human transmission among close contacts . it is estimated in a model that covid-19 would have resulted in 7.0 billion infections and 40 million deaths globally in 2020 in the absence of any intervention (walker et al., 2020) . 
wuhan city, the capital of hubei province in china, became the first outbreak center of covid-19 in december 2019 (phelan, katz & gostin, 2020). during the chinese new year holidays, chinese people traditionally travel to their hometowns for family reunions and gatherings to celebrate the beginning of the lunar new year. hence, many people, including groups of macao citizens, were obligated to stay in hubei province after the local government announced a "lockdown"/sanitary cordon on 23 january 2020, that is, 2 days before chinese new year. it was not until 7 march 2020 that the macao sar government sent a special team to wuhan, china to pick up 57 macao citizens from 31 families, who had stayed in 10 different cities in hubei province (macao sar government portal, 2020b). covid-19 was ruled out in all of them afterwards. a cross-sectional survey with an in-depth questionnaire was conducted among these people, all of whom remained uninfected by sars-cov-2 in a high-risk area, hubei province, china. this study aims to identify the common grounds and personal behaviors leading to a zero-infection rate among participants, which might provide crucial hints for global covid-19 pandemic control. any citizen who presented in hubei province with a body temperature equal to or greater than 37.5 degrees celsius was not allowed to board. after arrival in macao, all 57 citizens were sent to the public health clinical center for a 14-day quarantine. three serial nasopharyngeal swabs were obtained on day 2, day 7 and day 13 for viral rna detection by real-time rt-pcr, and all were negative (100%) (macao sar government portal, 2020a). serum antibodies against sars-cov-2 were tested on day 14, before the citizens were released from quarantine, with all results negative (100%). none of the citizens reported any symptoms during the quarantine period.
a questionnaire was designed to obtain demographic information, activity in hubei province, contact history, and personal health behaviors such as handwashing habits, mask usage and home cleaning. participants aged 15 or over were eligible for this study. the questionnaire was delivered to the isolation ward and was self-administered. written consent was collected in digital format. infants and children under fifteen years of age were considered ineligible for this survey. this study was approved by the hospital medical ethical committee of centro hospitalar conde de são januário, macao sar, china. descriptive statistics were used to summarize demographic information, high-risk activities and common preventive measures via standard parameters such as percentage, mean and median. we then compared behavior before and during the covid-19 outbreak using the wilcoxon signed rank test for continuous variables or the mcnemar test for dichotomized variables. the statistical significance level was set at α = 0.05. the statistical analysis was conducted using r (version 3.5.2, r core team, 2018). a total of 42 effective questionnaires were included in the final analysis after exclusion of 14 infants and children under 15 years of age and one missing questionnaire (response rate: 97.7%). the demographic information is summarized in table 1. the majority of the participants were aged between 20 and 44 years (52.4%) and had received secondary education or above (97.6%). females made up more than 70% of this group of returnees. the most common comorbid diseases were hypertension (7.1%), followed by diabetes mellitus (4.8%) and hepatitis (4.8%). more than half of the respondents were non-smokers (61.9%). the main reason for visiting hubei was to visit relatives (88.1%).
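the eligibility and response-rate figures above follow from simple arithmetic, sketched here in python. the variable names are hypothetical; the numbers (57 returnees, 14 excluded children, 42 analyzed questionnaires) are taken from the text.

```python
total_returnees = 57      # macao citizens escorted back from hubei
excluded_under_15 = 14    # infants and children ineligible for the survey
analyzed = 42             # effective questionnaires (one was missing)

# Eligible respondents are the returnees aged 15 or over.
eligible = total_returnees - excluded_under_15
response_rate = 100 * analyzed / eligible

print(f"eligible: {eligible}")
print(f"response rate: {analyzed}/{eligible} = {response_rate:.1f}%")
```

this reproduces the 42 out of 43 denominator given in the abstract.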
more than 85 percent of participants thought the most important reason for not getting covid-19 was keeping distance ("staying away") from crowds and reducing clusters or gatherings, followed by good personal protective measures (73.8%). mobility and participation in high-risk activities were restricted for these participants in hubei province under the emergency response policy, as the respondents specified (table 2): 97.6% of them did not visit crowded places; 90.5% did not use any public transportation; 90.5% did not go to any supermarket. about three-quarters of respondents received daily supplies at home via unified delivery. none of them visited or traveled to other provinces or cities (0%). all the participants (100%) denied any contact with suspected or confirmed covid-19 patients, while 4.8% of the participants stated that there were confirmed covid-19 cases in their local community. a further comparison of personal preventive measures before and during the outbreak showed increased alertness and practice of personal protection and hygiene during the spread (table 3), such as wearing a mask when outdoors (16.7% and 95.2%, p < 0.001), wearing a mask every time when in contact or talking with people (10% and 95%, p < 0.001), often washing hands with soap or liquid soap (85.7% and 100%, p = 0.031), using alcohol-based hand sanitizers or disinfectant wipes as a substitute when handwashing facilities were not available (71.4% and 95.2%, p = 0.006), cleaning clothes and personal belongings immediately after getting back home (35.7% and 78.6%, p < 0.001), and cleaning the mobile phone regularly (43.9% and 65.9%, p = 0.012). only 11.9% of respondents attended meal gatherings regularly during the spread compared with 59.5% before (p < 0.001). the increase in personal measures is significant and may reflect the effectiveness of public health interventions.
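the paired before/during comparisons in table 3 are of the kind the mcnemar test handles. below is a minimal python sketch of the exact (binomial) form of that test, which depends only on the discordant pairs. the function name `mcnemar_exact` is hypothetical, and the source does not state whether the exact or the chi-square variant was used. the handwashing counts are an assumption back-calculated from the reported percentages (85.7% of 42 is 36 before; 100% is 42 during), which implies that 6 respondents changed from no to yes and none from yes to no.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant pair counts.

    b: pairs that were 'yes' before and 'no' during
    c: pairs that were 'no' before and 'yes' during
    Under the null hypothesis each discordant pair is equally likely
    to fall in either direction, so min(b, c) ~ Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Assumed counts for "often wash hands with soap": 36/42 before, 42/42 during.
p = mcnemar_exact(b=0, c=6)
print(p)  # 0.03125
```

under those assumed counts the exact p-value is 0.03125, which agrees with the reported p = 0.031. for the other items the discordant counts cannot be recovered from the percentages alone, so they are not recomputed here.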
the aim of this research was to investigate the factors that contributed to the absence of covid-19 in this high-risk population in hubei province. on the one hand, good physical health could be one factor, as the majority of participants were below the age of 45 (61.9%) and non-smokers (61.9%), and 85.7% had no underlying chronic diseases. however, further studies are needed to determine the exact effect of physical health on the risk of covid-19 infection. on the other hand, it was also important to stop the transmission chain via political measures or personal health behaviors. on 23 january 2020 (2 days before the chinese new year), the chinese government imposed a "lockdown" in wuhan and other cities in hubei to quarantine this center, which is commonly referred to as the "wuhan lockdown" (health-commission, 2020). all public transport, including buses, railways, flights and ferry services, was suspended, with all stations and airports closed. the residents of wuhan were not allowed to leave the city without permission, which was unprecedented in public health history. in addition, social measures including a ban on mass gatherings such as concerts or competitions, closure of entertainment venues and public facilities, school closures and mandatory mask wearing in public areas were applied to mitigate the outbreak by controlling the source of infection and blocking transmission routes (pan et al., 2020). as a result, the respondents of our study reported highly restricted mobility in wuhan, china. a total of 97.6% of them denied visiting crowded places, which required high self-discipline together with cooperating public measures.
to achieve this level of mobility restriction, the local authority organized a team of volunteers to facilitate the delivery of food and other supplies to each family in home quarantine (chinanews.com, 2020); 76.2% of participants received essential materials via this method, which decreased the chance of outdoor activity and interaction with others. nonetheless, 85% of respondents said that "staying away from crowds" was the major reason for not being infected. table 2: high-risk activities and daily supply conditions among respondents during the covid-19 outbreak in early 2020 in hubei, china. moreover, there was emerging evidence suggesting that these "lockdown" measures played a role in decreasing covid-19 incidence (colbourn, 2020; gostin & wiley, 2020; klompas et al., 2020; phelan, katz & gostin, 2020; the lancet respiratory medicine, 2020). it was estimated that the wuhan travel ban delayed epidemic progression by 3-5 days in mainland china (chinazzi et al., 2020; tian et al., 2020) while reducing case importations to other countries by nearly 80% through mid-february (chinazzi et al., 2020). furthermore, in a recent investigation, the rate of confirmed cases and the effective reproduction number (rt), that is, the mean number of secondary cases generated by a typical primary case at time t in a population, declined from 24 january 2020 and fell below 1.0 from 6 february 2020 (pan et al., 2020). although intensive physical distancing and "lockdown" could help in "flattening the curve" of covid-19 and preventing a sharp rise in demand on health system capacity, the social and economic effects of "lockdown" and its knock-on effects on health, such as mental health and interpersonal violence, must be considered (parmet & sinha, 2020). yet, our data showed that over half of the participants (57.1%) felt "calm" during their stay in hubei province, which was somewhat counterintuitive.
we hypothesized that the provision of sufficient logistic support to isolated families by local authorities and clear delivery of information to the public during a "lockdown" help to ease stress and minimize subsequent psychological impact (brooks et al., 2020). therefore, local governments should be advised to create a comprehensive strategy and to prudently evaluate concerns including racism, adequate explanation to the public of the rationale and benefits, logistic capacity and resources, and cultural factors that may hinder compliance, before implementing large-scale mobility restrictions (parmet & sinha, 2020). a "lockdown" could even lead to precarious situations that heighten transmission in some countries if the corresponding support is not comprehensive and tailored to their own economic and social conditions; for example, workers may be packed into state-run shelters, as during the "lockdown" in india (pulla, 2020). likewise, the announcement of the closure of the gambling industry in macau during the first half of february, which was accompanied by policies on financial support and supply of resources, could serve as a reference for the administration of such measures (macao sar government portal, 2020c). additionally, the significant behavior changes among participants between before and during the outbreak consisted of wearing a mask outdoors more often, washing hands more frequently, cleaning and disinfecting the home more frequently, and attending fewer meal gatherings. although sars-cov-2 was commonly believed to be transmitted via droplets and contact, there is currently no evidence that wearing a surgical mask alone can prevent healthy persons from infection with respiratory viruses including covid-19, while inappropriate use or disposal may even increase risk (world health organization (who), 2020a).
however, none of the participants in our study agreed that it was less important to wash hands after wearing masks, and all of them (100%) believed that the incidence of accidentally touching the face or nose would be reduced after wearing a mask. a meta-analysis of the effectiveness of personal protective measures in preventing pandemic influenza transmission showed a significant protective effect of hand hygiene but mixed results for mask use, and mask wearing was therefore suggested to be applied alongside hand hygiene (saunders-hastings et al., 2017). wearing a mask might also act as a "symbol" that increases individual awareness of good hygiene practice (klompas et al., 2020). however, a universal mask-wearing scheme in public should emphasize concurrent hand hygiene practice and social distancing as a bundle, while the allocation and availability of resources should be taken into account first to ensure adequate protection for healthcare workers (emanuel et al., 2020). there were some limitations in this study. the elisa kits used for antibody detection were qualitative and could not provide titer information. although the sample size of this questionnaire survey was limited and recall bias was inevitable, its implications may indirectly reflect the effectiveness of public health interventions in wuhan, china, including sanitary cordon, traffic restriction, social distancing, home confinement, centralized quarantine and universal symptom survey. such interventions were aimed at preventing individuals from face-to-face interaction and preventing asymptomatic covid-19 patients from spreading the coronavirus within the community. the lack of infected citizens precludes comparison of differences in measures or behavior, and further studies are warranted to determine the effectiveness of each preventive measure on covid-19 at the individual level.
moreover, some of the participants had stayed in their relatives' homes, where cleaning was not their responsibility. hence the answers on home cleaning might partially reflect the attitudes of their relatives or friends. our findings were in line with common preventive measures advised by the world health organization. good personal hygiene and adequate preventive measures such as fewer gatherings and frequent handwashing, in addition to wearing a mask outdoors, were common grounds among 42 uninfected participants during their stay in hubei province under the covid-19 outbreak. furthermore, the success of the "lockdown" and self-quarantine policy in hubei province could be attributed to the local authority's strong logistical provision and transparency of information about the policy's rationale, which helped maintain better mental health and thus increased compliance with and efficacy of preventive measures. iek hou leong conceived and designed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft. the following information was supplied relating to ethical approvals (i.e., approving body and any reference numbers): the medical ethical committee of centro hospitalar conde de são januário, macau sar, china, granted ethical approval to carry out the study within its facilities. the following information was supplied regarding data availability: the raw measurements are available in the supplemental file. supplemental information for this article can be found online at http://dx.doi.org/10.7717/peerj.9428#supplemental-information.
the psychological impact of quarantine and how to reduce it: rapid review of the evidence unified purchase of daily supplies by regular assessment in wuhan community the effect of travel restrictions on the spread of the 2019 novel coronavirus (covid-19) outbreak covid-19: extending or relaxing distancing control measures fair allocation of scarce medical resources in the time of covid-19 governmental public health powers during the covid-19 pandemic: stay-at-home orders, business closures, and travel restrictions control strategies for novel coronavirus infections in wuhan city universal masking in hospitals in the covid-19 era early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia 57 macao residents return from hubei released from quarantine 57 macao residents return home safely from hubei the chief executive of macau sar announced: gambling industry will be suspended for half month and people should stay at home to avoid from covid-19 association of public health interventions with the epidemiology of the covid-19 outbreak in wuhan covid-19-the law and limits of quarantine the novel coronavirus originating in wuhan, china: challenges for global health governance covid-19: india imposes lockdown for 21 days and cases rise r: a language and environment for statistical computing. vienna: r foundation for statistical computing effectiveness of personal protective measures in reducing pandemic influenza transmission: a systematic review and meta-analysis covid-19: delay, mitigate, and communicate an investigation of transmission control measures during the first 50 days of the covid-19 epidemic in china the global impact of covid-19 and strategies for mitigation and suppression the-community-during-home-care-and-in-health-care-settings-in-the-context-of-the-novel covid-19) situation report-84 we thank dr. tan fong cheong and ms. hong lei lou for their assistance in data collection and coordination. 
the authors received no funding for this work. the authors declare that they have no competing interests. chon fu lio conceived and designed the experiments, performed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft. hou hon cheong conceived and designed the experiments, performed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft. chin ion lei conceived and designed the experiments, analyzed the data, authored or reviewed drafts of the paper, and approved the final draft.iek long lo conceived and designed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft. lan yao conceived and designed the experiments, performed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft. chong lam conceived and designed the experiments, analyzed the data, authored or reviewed drafts of the paper, and approved the final draft. key: cord-271980-8x5g8r7c authors: yao, ye; pan, jinhua; liu, zhixi; meng, xia; wang, weidong; kan, haidong; wang, weibing title: ambient nitrogen dioxide pollution and spread ability of covid-19 in chinese cities date: 2020-09-30 journal: ecotoxicol environ saf doi: 10.1016/j.ecoenv.2020.111421 sha: doc_id: 271980 cord_uid: 8x5g8r7c this study aims to explore the relationship between ambient no(2) levels and the transmission ability (basic reproductive number, r(0)) of covid-19 in 63 chinese cities. after adjustment for temperature and relative humidity, r(0) was positively associated with no(2) concentration at city level. 
the temporal analysis within hubei province indicated that all the 11 hubei cities (except xianning city) had significant positive correlations between no(2) concentration (with 12-day time lag) and r(0) (r>0.51, p<0.005). the association between ambient no(2) and r(0) indicates that no(2) may increase the underlying risk of infection in the transmission process of covid-19. in addition, because no(2) is also an indicator of traffic-related air pollution, the association between no(2) and covid-19's spread ability suggests that reduced population movement may have reduced the spread of sars-cov-2. the covid-19 pandemic has highlighted the importance of international solidarity and unity in the face of a dire global health and economic crisis. the pandemic, which was first reported in december 2019 in wuhan, china, has caused 6,757,764 confirmed cases worldwide as of jul 31, 2020, with 88,077 cases reported in china (nhc, 2020). although massive intervention measures (e.g., shutting down cities, extending holidays, and travel bans) have been implemented in china and many other countries, the spread of the disease is unlikely to be stopped worldwide in the near future. no effective vaccines or antiviral drugs have been clinically approved so far. our current understanding of the factors that impact sars-cov-2 transmission is still limited. environmental factors are associated with the seasonality of respiratory-borne disease epidemics (sooryanarain and elankumaran, 2015). some research has investigated individual exposure to both indoor and outdoor environmental nitrogen dioxide (no2) pollution (salonen et al., 2019). previous cross-sectional and cohort research has provided evidence that ambient no2 exposure had longitudinal effects on growth in lung function (molter et al., 2013), causing pulmonary insufficiency (e.g., lung volume, expiratory flow).
in addition, previous studies have suggested that ambient no2 exposure may play a role in the phenotypes of respiratory diseases including but not limited to influenza (huang et al., 2016), asthma (weinmayr et al., 2010), and severe acute respiratory syndrome (kan et al., 2005). for example, no2 might increase adults' susceptibility to viral infections (goings et al., 1989). exposure to high levels of no2 before the start of a respiratory viral infection is associated with the severity of asthma exacerbation (chauhan et al., 2003). recently, a european study found that 78% of covid-19 fatalities were located in five regions that showed the highest concentrations of no2 (ogen, 2020). this finding indicates that long-term no2 exposure may be an important risk factor for covid-19 fatality. however, contini and costabile (2020) noted that the relationships between atmospheric parameters and covid-19 prevalence or fatality are influenced by several confounding factors, which makes correlations in descriptive studies difficult to interpret because they do not necessarily indicate a cause-effect relationship. although this is an inevitable limitation of our descriptive study, we aim to thoroughly explore the influence of no2 on covid-19 transmission and to obtain more robust results by adjusting for potential confounders. in this study, we aim to assess the associations between ambient no2 levels and the spread ability of covid-19 across 63 chinese cities, and we provide information to facilitate the further prevention and control of covid-19.

3 methods

we collected covid-19 confirmed case information reported by the national health commission of the people's republic of china (who, 2020) and the health commission of hubei province (http://wjw.hubei.gov.cn/bmdt/ztzl/fkxxgzbdgrfyyq/).
guidelines on the diagnosis and treatment of patients were defined according to the fourth version of the guidelines (issued on january 27, 2020). the clinical criteria for diagnosis were to meet any two of the three remaining clinical criteria (i.e., fever, radiographic findings of pneumonia, and normal or reduced white blood cell count or reduced lymphocyte count in the early stage of illness). an epidemiological criterion was added (e.g., linkage with a confirmed covid-19 case) (nhc, 2020; zhang et al., 2020) . the population movement in cities outside hubei from the same period was obtained from baidu qianxi data (https://qianxi.baidu.com/2020/), and we used migration index and travel intensity to describe the movement. we obtained hourly concentrations of various air pollutants, including sulfur dioxide (so 2 ), no 2 , carbon monoxide (co), ozone (o 3 ), fine particulate matter (pm 2.5 ), and inhalable particulate matter (pm 10 ). these data came from 63 cities (cities in china with more than 50 confirmed covid-19 cases as of february 10, 2020) and ranged from january 1, 2020 to february 8, 2020. the data were acquired from the national urban air quality the reproductive number (r 0 ), the average number of individuals infected by an initial infectious individual in a completely susceptible population, is fundamental to understanding disease transmission. we calculated r 0 for 63 chinese cities with more than 50 cases as of february 10, 2020 (the covid-19 peak period in china), including 12 and 51 cities inside and outside hubei, respectively. we used the method introduced by aaron et al. to estimate r 0 (aaron a. king et al., 2017) . first, we constructed a linear regression model to estimate the relevant coefficient. second, we obtained r 0 by combining the coefficients obtained from the previous step with the average incubation and confirmation periods. 
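the two-step procedure described above can be sketched roughly as follows. this is only an illustration under stated assumptions: the text does not give the exact expression used to combine the regression coefficient with the two periods, so the sketch assumes the common seir-type exponential-growth relation r0 = (1 + r·t_inc)(1 + r·t_conf), where r is the slope of log cumulative cases against time. the function names and the synthetic case counts are hypothetical, and the default periods (7 and 3.8 days) are the values reported for this study.

```python
import math

def fit_growth_rate(cumulative_cases):
    """Step 1: least-squares slope of log(cases) against day index 0, 1, 2, ..."""
    n = len(cumulative_cases)
    days = range(n)
    logs = [math.log(c) for c in cumulative_cases]
    mean_t = sum(days) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logs))
    sxx = sum((t - mean_t) ** 2 for t in days)
    return sxy / sxx

def estimate_r0(cumulative_cases, incubation_days=7.0, confirmation_days=3.8):
    """Step 2: combine the growth rate with the mean incubation and
    confirmation periods. The SEIR-type formula below is an ASSUMPTION,
    not necessarily the expression used in the study."""
    r = fit_growth_rate(cumulative_cases)
    return (1.0 + r * incubation_days) * (1.0 + r * confirmation_days)

# Hypothetical city whose cumulative cases grow by exactly 10% per day
synthetic_cases = [50 * 1.1 ** day for day in range(10)]
r0_example = estimate_r0(synthetic_cases)
```

with real surveillance data the log-linear fit would only be appropriate for the early, approximately exponential phase of the outbreak.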
we assigned the average values of the incubation period and the mean course from case infection to confirmation as 7 and 3.8 days, respectively. these values were obtained in previous mathematical research (pan et al., 2020). all calculations were completed in r software version 3.6.1 (r foundation for statistical computing). mediation is a hypothesized causal chain in which one variable affects a second variable that, in turn, affects a third variable (lederer et al., 2019). the relationship between no2 concentration and r0 of covid-19 may be mediated by population density or other factors, such as city population and city area. those mediators may indirectly affect the r0 value of covid-19 by modulating the no2 concentration, thus affecting the spread of covid-19. in this study, we used mediation analysis to explore whether these factors were mediators of the relationship between no2 and r0 of covid-19, and we used bootstrapping to estimate standard errors while testing the significance of these mediating effects. we conducted a cross-sectional analysis to examine the associations of no2 with r0 of covid-19. we also conducted a longitudinal analysis to examine the temporal associations (with daily data points) of no2 with r0 in cities inside hubei province from the date when they had enough confirmed cases to yield stable daily r0 values. the other covariates, including health policies, were quite similar throughout hubei province. when examining the correlation between no2 and r0 of covid-19, we estimated the associations of no2 concentration with r0 separately inside and outside hubei province (r & p) in the same period by using multiple linear regression models after controlling for temperature and relative humidity (as covariates in the regression model). then, we used meta-analysis to pool the estimates of the specific associations of no2 concentration with r0 (meta χ2 & p).
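a minimal sketch of a bootstrapped mediation estimate is given below. it assumes the product-of-coefficients approach (indirect effect = a·b, where a is the slope of the mediator on the exposure and b is the slope of the outcome on the mediator adjusted for the exposure); the text does not state which mediation estimator was used, so this choice, the function names, and the simulated data are all illustrative assumptions.

```python
import random

def _cov(u, v):
    """Sample covariance of two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def indirect_effect(x, m, y):
    """Product-of-coefficients indirect effect a*b (assumed estimator).

    a: slope of mediator m on exposure x
    b: slope of outcome y on m, adjusted for x (two-predictor OLS)
    """
    vx, vm = _cov(x, x), _cov(m, m)
    cxm, cxy, cmy = _cov(x, m), _cov(x, y), _cov(m, y)
    a = cxm / vx
    b = (cmy * vx - cxy * cxm) / (vx * vm - cxm ** 2)
    return a * b

def bootstrap_se(x, m, y, n_boot=200, seed=0):
    """Bootstrap standard error: resample cities with replacement."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    mean = sum(estimates) / n_boot
    return (sum((e - mean) ** 2 for e in estimates) / (n_boot - 1)) ** 0.5

# Simulated data (hypothetical): true a = 2, true b = 3, so a*b = 6
_rng = random.Random(0)
x_demo = [_rng.gauss(0, 1) for _ in range(200)]
m_demo = [2 * xi + _rng.gauss(0, 1) for xi in x_demo]
y_demo = [3 * mi + xi + _rng.gauss(0, 1) for xi, mi in zip(x_demo, m_demo)]
effect = indirect_effect(x_demo, m_demo, y_demo)
se = bootstrap_se(x_demo, m_demo, y_demo)
```

dividing the estimated effect by its bootstrap standard error gives a z-type statistic, which is one way the significance of a mediating effect could be tested.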
we also examined the corresponding temporal associations between no2 and r0 of covid-19 across the different cities inside and outside hubei province using multiple linear regression models after controlling for temperature and relative humidity separately. the change of r0 per 10 μg/m3 increase in no2 pollution was calculated. given that associations between no2 and covid-19 prevalence are influenced by several confounding factors, we further examined the associations of no2 with the r0 of covid-19 with adjustment for density of population, gdp per capita and hospital beds per capita in the main model. among the 63 investigated cities, the mean±standard deviation and range of no2 concentration and r0 were (27.9±8.3 μg/m3, 10.7-53.0 μg/m3) and (1.4±0.3, 0.6-2.5), respectively. the cities with the three highest r0 values were wuhan, huanggang, and yichang, which are all in hubei province. the similarity of the spatial distributions between r0 and no2 suggests a relationship between r0 and no2 concentration (figure 1). both inside and outside hubei province, the daily no2 concentration trend from january to march was almost the same across 2016-2019, but the daily no2 concentration in 2020 was clearly lower than in the other years, especially after january 23, 2020 (figure 2), which may be due to the closure of wuhan city in hubei. the scatter diagram of r0 and no2 distributions (figure 3) shows that r0 tends to increase with no2 concentration, suggesting a positive correlation between r0 and no2 concentration. the cross-sectional analysis indicates that, after adjustment for temperature and relative humidity, r0 was positively associated with no2 concentration at city level (meta χ2=10.18, p=0.037) (figure 3).
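the covariate-adjusted regression and the "change of r0 per 10 μg/m3" quantity described in the methods can be illustrated with the sketch below. it fits ordinary least squares through the normal equations in plain python; the city values are synthetic and noise-free (generated from a known coefficient), and the helper name `ols` is hypothetical. the study itself used r, so this is not the authors' code.

```python
def ols(rows, y):
    """Ordinary least squares via the normal equations.

    rows: list of predictor tuples, one per city
    y:    list of outcomes
    Returns [intercept, coef_1, ..., coef_k].
    """
    X = [[1.0, *r] for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(xi[p] * xi[q] for xi in X) for q in range(k)]
         + [sum(xi[p] * yi for xi, yi in zip(X, y))]
         for p in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[p][k] / A[p][p] for p in range(k)]

# Synthetic cities: (NO2 in ug/m3, temperature in C, relative humidity in %)
cities = [(20, 5, 60), (30, 8, 70), (25, 3, 80),
          (40, 10, 65), (35, 6, 75), (15, 4, 55)]
# Synthetic R0 generated with a known NO2 coefficient of 0.012 per ug/m3
r0_values = [1.0 + 0.012 * no2 + 0.01 * temp - 0.002 * hum
             for no2, temp, hum in cities]

intercept, b_no2, b_temp, b_hum = ols(cities, r0_values)
change_per_10 = 10 * b_no2  # change in R0 per 10 ug/m3 increase in NO2
```

the coefficient on no2 is the adjusted slope, so multiplying it by 10 expresses the association on the 10 μg/m3 scale used in the results.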
additionally, we further examined the associations of no2 with the r0 of covid-19 adjusted for density of population, gdp per capita, and hospital beds per capita separately in the main model, and we found that none of the three covariates altered the significant positive association between no2 and r0. in the following stratified analysis, a significant association was confirmed in cities outside hubei (r=0.29, p=0.046), whereas the trend observed in cities inside hubei was not significant (r=0.51, p=0.130) (figure 3). for every 10 μg/m3 increase in no2, r0 increased by 0.12 (0.01-0.23) and 0.52 (−0.20 to 1.25), respectively. we did not find significant associations of temperature or relative humidity with r0 of covid-19 (meta χ2=4.62, p=0.370 and meta χ2=1.63, p=0.800, respectively).

figure 3. the basic reproductive number r0 was positively associated with no2 (meta χ2=10.18, p=0.037) in cities outside hubei (blue points, 51 cities, r=0.29, p=0.046, solid line) and cities inside hubei (green points, 12 cities, r=0.51, p=0.13, dashed line). we controlled the effects from temperature and relative humidity in the multiple linear regression models.

in addition, we found that r0 was positively associated with the average no2 value from 2016-2019 (meta χ2=13.74, p=0.008; figure 4a) with adjustment for temperature and relative humidity. because the average no2 value from 2016-2019 was significantly associated with that in early 2020 (r=0.85, p<0.0001), it is difficult to determine which factor is dominant in covid-19 transmission. moreover, the other investigated air pollutants (so2, co, o3, pm2.5, and pm10) had no significant associations with r0 (meta χ2<9.09, p>0.06; figure 4b-f). furthermore, to account for potential population movement effects in our study, which could decrease both no2 and r0, we collected reduced population movement data from 51 cities outside hubei in the same period.
we re-calculated the no2-r0 associations including the population movement as a covariate, and we found that no2 was still significantly correlated with r0 of covid-19 outside hubei (r=0.32, p=0.024).

figure 4. (a) the basic reproductive number r0 was positively associated (meta χ2=13.74, p=0.0082) with the average no2 value from 2016-2019. (b)-(f) there were no significant associations between other air pollutants (so2, co, o3, pm2.5, and pm10) and r0 (meta χ2<9.09, p>0.06). we controlled the effects from temperature and relative humidity in the multiple linear regression models.

we calculated the daily r0 values of 11 cities in hubei (except wuhan) from january 27 to february 26, 2020 (there were few covid-19 confirmed cases in these cities afterwards) and normalized them based on wuhan's daily r0 value to eliminate the effects of other covariates. we found that the 11 hubei cities (except xianning city) had significantly positive correlations between no2 concentration (with 12-day time lag) and r0 (r>0.51, p<0.005), suggesting a positive association between daily no2 concentration and covid-19 spread ability on the temporal scale (figure 5). the same conclusion was reached for other time lag settings, but the most significant value was obtained with a delay of 12 days. the results of residual analysis and principal component analysis are shown in figure s1 and figure s2, respectively.

figure 5. temporal correlation between no2 concentration and r0 in 11 cities in hubei. except for xianning, all of those cities had significant positive correlations (r>0.51, p<0.005) between no2 (with 12-day time lag) and daily r0 (normalized based on wuhan's daily r0).

to eliminate the effects of city population and city area on the relationship between no2 concentration and r0 value, we applied a mediation analysis to verify whether more densely populated cities had both greater r0 and no2 concentration values.
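the time-lagged correlation used in the temporal analysis can be sketched as below: no2 on day t is paired with r0 on day t + lag, and a pearson coefficient is computed over the overlapping days. the function names and the synthetic series are hypothetical; the synthetic r0 series is constructed to depend on no2 from 12 days earlier so that the behaviour of the lag is visible.

```python
import math

def pearson_r(u, v):
    """Pearson correlation coefficient of two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def lagged_correlation(exposure, outcome, lag_days):
    """Correlate exposure on day t with outcome on day t + lag_days."""
    if lag_days > 0:
        exposure, outcome = exposure[:-lag_days], outcome[lag_days:]
    return pearson_r(exposure, outcome)

# Synthetic daily NO2 series (hypothetical values, ug/m3)
no2_daily = [30 + 10 * math.sin(0.7 * t) + 0.3 * t for t in range(43)]
# Synthetic daily R0 that follows NO2 from 12 days earlier
r0_daily = [1.0] * 12 + [0.5 + 0.02 * e for e in no2_daily[:-12]]

r_lag12 = lagged_correlation(no2_daily, r0_daily, 12)
r_lag0 = lagged_correlation(no2_daily, r0_daily, 0)
```

scanning `lag_days` over a range of values and comparing the resulting coefficients is one way to see which delay gives the strongest association.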
after adjustment for temperature and relative humidity, the mediation analysis found insignificant direct and indirect effects of city population and city area on r0 (z=−1.43, p=0.15 & z=−0.24, p=0.800 and z=−0.46, p=0.650 & z=1.15, p=0.250, respectively). thus, there were no apparent mediation effects between city population, city area, no2, and r0. city population and city area did not influence the association between no2 concentration and r0. this study explored the association between environmental factors and covid-19 transmission. to our knowledge, little research has been done on the relationship between ambient air pollution and covid-19 transmission. our results show a significant association between no2 exposure and r0, suggesting that ambient no2 may contribute to the spread ability of covid-19. to prevent city population and city area from affecting the relationship between no2 concentration and r0 level, we applied a mediation analysis to verify whether more densely populated cities have both greater r0 values and higher no2 concentrations. the results showed that city population and city area did not influence the association between no2 concentration and r0 level. although the closures of cities throughout hubei occurred at approximately the same time (the other cities of hubei were locked down no more than 1-2 days after wuhan city), the effect of the lockdown in different cities (e.g., cities with busy traffic vs. small rural cities) was not expected to have the same influence on the association between no2 and covid-19 transmission. multiple factors (the population density of the city, the typical road traffic and commercial exchanges, etc.) may still have confounded the association in the current analysis, but we controlled for as many factors as possible, including the density of population, gdp per capita and hospital beds per capita, to reduce confounding and strengthen our results. previous studies have also suggested that the increased spread ability resulting from no2 exposure might be caused by the effects of no2 on host defenses that prevent viral spread (becker and soukup, 1999). chen et al. (2007) found that exposure to no2 may harm human health by interacting with the immune system; in addition, mills et al. (2015) observed that short-term exposure to no2 increased hospital admission rates for a range of respiratory diseases in different age groups. therefore, we speculate that no2 may contribute directly to the infection process of covid-19. in addition, no2 emissions primarily come from burning fossil fuels (diesel, gasoline, coal), resulting in automobile and smokestack exhaust, the latter of which can be produced by electricity generation. therefore, changes in no2 levels can be used to indicate changes in human activity and population movement due to the lockdown of cities. for example, since january 23, 2020, the daily average concentration of no2 after the closure of wuhan has been clearly lower than that of the same period in previous years (figure 2). moreover, it is well known that respiratory viruses spread through contact (direct or indirect via fomites) or through contaminated droplets emitted when infected individuals cough, sneeze, breathe or speak, both of which are related to human contact, social distance and population movement. because no2 is an indicator of traffic-related air pollution, the association between no2 and r0 of covid-19 may be explained by the relationship between viral spread and population movement.
of course, further investigations are warranted to provide additional details and illustrate this mechanism. our study has some limitations. first, the averaging of no2 concentrations across cities likely resulted in an unknown degree of exposure misclassification, given the spatial variability and traffic-dependence of no2 and the potential for indoor exposure. second, r0 could be highly variable and is influenced by a variety of factors, including not only the previously mentioned mitigation efforts but also the comprehensiveness of case identification. third, because corresponding data on no were lacking, we did not explore the association between the primary pollutant no and the transmission ability of covid-19. given the ecological nature of this study, other city-level factors, such as the implementation ability of covid-19 control policy, urbanization rate, and availability of medical resources, may affect the transmissibility of covid-19 and confound our findings. future studies should develop individual-based models with high spatial and temporal resolution to assess the correlations between air pollution and the epidemiologic characteristics of covid-19. the mechanisms linking no2 and the transmission of covid-19 still require further research; in addition, the spread of covid-19 could be affected by many factors. we also believe that there is likely to be an interaction between environmental factors and npis, which deserves further analysis.
introduction to model parameter estimation
effect of nitrogen dioxide on respiratory viral infection in airway epithelial cells
personal exposure to nitrogen dioxide (no2) and the severity of virus-induced asthma in children
outdoor air pollution: nitrogen dioxide, sulfur dioxide, and carbon monoxide health effects
effect of nitrogen dioxide exposure on susceptibility to influenza a virus infection in healthy adults
acute effects of air pollution on influenza-like illness in nanjing, china: a population-based study
relationship between ambient air pollution and daily mortality of sars in beijing
control of confounding and reporting of results in causal inference studies. guidance for authors from editors of respiratory, sleep, and critical care journals
quantitative systematic review of the associations between short-term exposure to nitrogen dioxide and mortality and hospital admissions
long-term exposure to pm10 and no2 in association with lung volume and airway resistance in the maas birth cohort
national health commission of the people's republic of china. diagnosis and treatment guideline on pneumonia infection with 2019 novel coronavirus
assessing nitrogen dioxide (no2) levels as a contributing factor to coronavirus (covid-19) fatality
effectiveness of control strategies for coronavirus disease 2019: a seir dynamic modeling study
human exposure to no2 in school and office indoor environments
environmental role in influenza virus outbreaks
short-term effects of pm10 and no2 on respiratory health among children with asthma or asthma-like symptoms: a systematic review and meta-analysis
coronavirus disease (covid-19) situation report - 121
evolving epidemiology and transmission dynamics of coronavirus disease 2019 outside hubei province, china: a descriptive and modelling study

the funders of the study had no role in study design, data collection, data analysis, data interpretation, or writing the report.
the corresponding author had full access to all of the study's data and takes final responsibility for the decision to submit for publication. the authors declare no competing interests. key: cord-325012-yjay3t38 authors: chen, ze-liang; zhang, qi; lu, yi; guo, zhong-min; zhang, xi; zhang, wen-jun; guo, cheng; liao, cong-hui; li, qian-lin; han, xiao-hu; lu, jia-hai title: distribution of the covid-19 epidemic and correlation with population emigration from wuhan, china date: 2020-02-28 journal: chin med j (engl) doi: 10.1097/cm9.0000000000000782 sha: doc_id: 325012 cord_uid: yjay3t38 background: the ongoing new coronavirus pneumonia (corona virus disease 2019, covid-19) outbreak is spreading in china, but it has not yet reached its peak. five million people emigrated from wuhan before lockdown, potentially representing a source of virus infection. determining case distribution and its correlation with population emigration from wuhan in the early stage of the epidemic is of great importance for early warning and for the prevention of future outbreaks. methods: the official case report on the covid-19 epidemic was collected as of january 30, 2020. time and location information on covid-19 cases was extracted and analyzed using arcgis and winbugs software. data on population migration from wuhan city and hubei province were extracted from baidu qianxi, and their correlation with the number of cases was analyzed. results: the covid-19 confirmed and death cases in hubei province accounted for 59.91% (5806/9692) and 95.77% (204/213) of the total cases in china, respectively. hot spot provinces included sichuan and yunnan, which are adjacent to hubei. the time risk of hubei province on the following day was 1.960 times that on the previous day.
the number of cases in some cities was relatively low, but the time risk appeared to be continuously rising. the correlation coefficient between the provincial number of cases and emigration from wuhan was up to 0.943. the lockdown of 17 cities in hubei province and the implementation of nationwide control measures efficiently prevented an exponential growth in the number of cases. conclusions: the population that emigrated from wuhan was the main infection source in other cities and provinces. some cities with a low number of cases showed a rapid increase in case load. owing to the upcoming spring festival return wave, understanding the risk trends in different regions is crucial to ensure preparedness at both the individual and organization levels and to prevent new outbreaks. emerging infectious diseases are a major challenge in the 21st century. in recent years, worldwide outbreaks of ebola and middle east respiratory syndrome caused great health and economic losses. [1, 2] the ongoing new coronavirus pneumonia (corona virus disease 2019, covid-19) outbreak is becoming a global public health problem. the covid-19 outbreak is highly similar to the severe acute respiratory syndrome (sars) outbreak that occurred in 2003; both outbreaks were caused by new coronaviruses during time periods overlapping with the chinese spring festival. [3] on december 31, 2019, the wuhan municipal health committee reported 27 cases of pneumonia with an unknown cause, and many cases were traced to the wuhan southern china seafood market, which was subsequently closed on january 1, 2020. [4] on january 7, 2020, laboratory tests showed that the pathogen causing the previously unexplained pneumonia was a new type of coronavirus; this pneumonia was then officially named covid-19 by the world health organization. [5, 6] the covid-19 outbreak started in wuhan and spread rapidly to other provinces and countries. 
[7, 8] as of january 30, 2020, a total of 34 provinces and regions in china had reported 9692 cases, and nearly all imported cases were derived from wuhan in hubei province. [9, 10] covid-19 has been defined as a class b infectious disease but has been managed as a class a infectious disease by the chinese government. daily case reports are being released, and any omission or concealment is punishable by law. currently, the number of cases is still increasing, and the epidemic has not yet reached its peak; however, the situation differs from province to province. information on the temporal and spatial distributions of cases is important for developing targeted treatment and prevention strategies. because the return peak of spring festival travel is approaching, information on the possible changes in the incidence of covid-19 in different cities will help in better preparation for disease prevention and management. therefore, in this study, we investigated the temporal and spatial distributions of the early covid-19 epidemic to reveal the dynamic changes and trends in reported cases. these results will provide valuable information for disease prevention at both the individual and organization levels. all officially reported confirmed and suspected cases of covid-19 and related deaths were collected from the official websites of health departments or articles citing their reports. case data were imported into microsoft excel (microsoft corporation, redmond, wa, usa) and analyzed. the national and hubei province shapefiles were used for arcgis (environmental systems research institute, redlands, ca, usa) analysis. the map was linked to an excel file containing time and location information. location data were available for 34 provinces of china and 17 prefecture level cities of hubei province. the time span was from january 16 to january 30, 2020. the covid-19 risk analysis was based on a bayesian space-time model implemented in the winbugs software (mrc biostatistics unit, cambridge, uk).
[11, 12] the model was divided into three levels: (1) data model: the case counts, being low-incidence data, were assumed to follow a poisson distribution with parameters $n_i$ and $m_{it}$: $y_{it} \sim \mathrm{poisson}(n_i m_{it})$, where, for hubei province, $y_{it}$ is the number of cases occurring in city $i$ ($i = 1, \ldots, 17$) on day $t$ ($t = 1, \ldots, 15$), and, nationwide, $y_{it}$ is the number of cases occurring in province $i$ ($i = 1, \ldots, 34$) on day $t$ ($t = 1, \ldots, 15$). we assumed that there was no change in the number of people at risk in each city during the study period, such that $n_i$ is the number of people at risk in city $i$, and $m_{it}$ is the corresponding disease risk in city $i$ on day $t$. (2) process model: a logarithmic transformation of the disease risk $m_{it}$ allows the relative risk to be expressed as a linear combination of spatial, temporal, and spatiotemporal interaction components. the mathematical expression is shown in equation (1):

$$\log(m_{it}) = a + s_i + (b_0 t^* + v_t) + b_{1i} t^* + e_{it} \qquad (1)$$

where $a$ is the fixed effect of the overall relative risk in the entire study area within 11 days, and $t^* = t - 5.5$ is the time relative to the intermediate time point. in this model, the risk of disease is broken down into three parts: spatial change, temporal change, and space-time interaction. $s_i$ is the component of spatial variability, describing the disease risk of each city relative to the risk in the entire study region over an 11-day observation period; $b_0 t^* + v_t$ is the change over time, which represents the overall trend of disease risk in the entire study area relative to that on the middle observation day, including the linear trend $b_0 t^*$ and the time random effect $v_t$; $b_0$ is the time coefficient, representing the time trend in the study area; and $b_{1i} t^*$ allows each city to have a different time-varying trend and is part of the spatiotemporal interaction. relative to $b_0$, $b_{1i}$ represents the local trend of change in each city; $e_{it}$ captures local variation that cannot be explained by the spatial, temporal, and spatiotemporal random effects. [13] (3) parametric model: according to the besag, york and mollié (bym) model, [14] the spatial structure effect is defined by a prior conditional autoregressive (car) structure. in this process, a spatial adjacency weight matrix needs to be defined: if regions $i$ and $j$ are adjacent, the weight $w_{ij} = 1$; otherwise $w_{ij} = 0$, and in particular $w_{ii} = 0$. similarly, $b_{1i}$ is also assumed to follow the bym specification. for the time structure effect $v_t$, a car process is used, and an adjacency weight matrix in time is defined. for the overdispersion parameter $e_{it}$, following gelman, a normal distribution with a mean of 0 and a variance of $\sigma_e^2$ is generally assumed, and the variance of each parameter obeys gamma (a, b). [15] based on this model, through the spatial component $s_i$ and its posterior probability, high- or low-risk cities (relative to the average risk [$a$] in the entire study area) can be identified. by calculating the probability that the spatial relative risk $\exp(s_i)$ is greater than 1, regions can be divided into five categories: those with probability >0.8, 0.6-0.8, 0.4-0.6, 0.2-0.4, and <0.2 are defined as hot spots, secondary hot spots, warm spots, sub-cold spots, and cold spots, respectively. similarly, based on the probability threshold, the differences in these regions can be identified considering the trend over time. further, based on the probability that $\exp(b_{1i})$ is greater than 1, regions can be divided into five categories: cities with an incidence risk probability greater than 0.8 show a trend for a rapid change in risk relative to the overall change, and those with an incidence risk probability between 0.6 and 0.8 show a trend for a greater change in the incidence risk than the overall change.
a value between 0.4 and 0.6 indicates that the change in the occurrence risk is the same as the overall risk change; 0.2 to 0.4, that the trend of change in disease risk is lower than the overall risk change; and less than 0.2, that the trend of change in disease risk is much lower than the overall risk. population migration data were collected from the baidu website (http://qianxi.baidu.com/). data on emigration from wuhan city and hubei province to other cities and provinces were extracted and edited with microsoft excel for windows (microsoft corporation). emigration intensity was calculated using the migration index multiplied by the migration proportion in the province or city. correlation analysis was performed using ibm spss statistics software (version 22; international business machines corporation, armonk, ny, usa). p values less than 0.05 were considered statistically significant. pearson correlation coefficients greater than 0.2 were considered indicators of a positive correlation. to obtain a general profile of the case distribution, we first analyzed all the available cases during this covid-19 outbreak. [16] as shown in figure 1a , the number of cases remained stable from january 11 to 15, 2020, and the number of new and cumulative cases increased rapidly after january 16. the first death was reported on january 10, and the number of deaths began to increase rapidly from january 17 onwards, with the cumulative number of deaths reaching 213 on january 30 [ figure 1b ]. [6] after the nucleic acid assay became available, suspected cases waiting for laboratory confirmation could be diagnosed rapidly. [17] after january 19, the number of suspected cases increased rapidly, and about 40% to 50% of these suspected cases were then confirmed [ figure 1c ]. before january 19, the number of severe cases remained low, but they increased steadily from january 20 onwards [ figure 1d ]. 
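the five probability bands used in the methods to label hot and cold spots can be sketched as a small helper. the posterior probability is approximated here as the share of posterior draws of the spatial effect that are positive (equivalent to a relative risk above 1); how exact boundary values such as 0.8 are assigned is not stated in the text, so the handling below is an assumption, and the function names and sample draws are hypothetical.

```python
def posterior_probability(spatial_effect_draws):
    """Share of posterior draws with s_i > 0, i.e. exp(s_i) > 1."""
    draws = list(spatial_effect_draws)
    return sum(1 for s in draws if s > 0) / len(draws)

def classify_spot(probability):
    """Map P(exp(s_i) > 1) to one of the five categories.

    Boundary values (exactly 0.8, 0.6, ...) fall into the lower band;
    this tie-breaking rule is an assumption.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability > 0.8:
        return "hot spot"
    if probability > 0.6:
        return "secondary hot spot"
    if probability > 0.4:
        return "warm spot"
    if probability > 0.2:
        return "sub-cold spot"
    return "cold spot"

# Hypothetical posterior draws for one region
example_draws = [0.2, -0.1, 0.4, 0.3]
example_label = classify_spot(posterior_probability(example_draws))
```

the same banding could be applied to the probability that exp(b1i) exceeds 1 to label how fast the local trend changes relative to the overall trend.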
because wuhan is the capital city of hubei province and the virus spread throughout the province quickly, we also analyzed the changes in number of cases in hubei province. on january 9, 41 cases were first reported, and by january 30, 5806 cases had been reported, accounting for 59.91% (5806/9692) of the total cases in china [ figure 1e ]. the cumulative number of figure 1f ]. these data indicated that both the incidence and mortality of covid-19 disease were the highest in hubei province. [18] before january 16, cases were mainly reported in hubei province. from january 17 onwards, the outbreak spread to many provinces and the number of cases increased rapidly. therefore, our spatial and temporal analyses used data from january 17 to 30, 2020. the location of each case was extracted from official reports and mapped onto the national map at the city level using arcgis. of the 362 cities, 307 (84.8%, 307/362) had reported cases. in general, the core outbreak area, wuhan, and its surrounding cities had the highest number of cases, followed by cities with a high population which are transportation hubs. spatial distribution was then analyzed with a bayesian model using winbugs. after nearly 100,000 iterations, the model converged successfully. after the model converged, it was iterated another 110,000 times to obtain parameter estimations. generally, a ratio close to 1 indicates that the two chain iterative sequences are close, and that the model has a good convergence and is stable [ figure 2a ]. using the established model and parameters, hot and cold spots were identified. the results showed that sichuan, yunnan, guizhou, hainan, and taiwan were hot spots, and inner mongolia, gansu, ningxia, qinghai, xinjiang, chongqing, hunan, and guangxi were secondary hot spots. generally, hot spots clustered in the midwest, and cold spots clustered in the southeast [ figure 2b ]. 
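the five-category labelling described in the methods can be sketched as follows. the thresholds (0.8, 0.6, 0.4, 0.2) come from the text; how exact boundary values are assigned is not stated there, so the choice below (a boundary value falls into the lower category) is an assumption, as are the function names. the posterior draws would come from the winbugs output; the short list used here is invented for illustration.

```python
def posterior_prob_above_one(relative_risk_draws):
    """Share of posterior draws of exp(s_i) that exceed 1."""
    draws = list(relative_risk_draws)
    if not draws:
        raise ValueError("need at least one posterior draw")
    return sum(1 for rr in draws if rr > 1.0) / len(draws)


def classify_spot(prob):
    """Map P(exp(s_i) > 1) to the five categories named in the text."""
    if not 0.0 <= prob <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if prob > 0.8:
        return "hot spot"
    if prob > 0.6:
        return "secondary hot spot"
    if prob > 0.4:
        return "warm spot"
    if prob > 0.2:
        return "sub-cold spot"
    return "cold spot"


# Invented draws for one city: 4 of 5 exceed 1, so the probability is 0.8.
example_draws = [1.4, 0.9, 1.2, 1.1, 1.3]
label = classify_spot(posterior_prob_above_one(example_draws))
print(label)
```

the same function could be applied to the probability that exp(b_1i) exceeds 1 to label the local trend categories.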
the overall temporal trend was calculated using the time risk model, exp(b_0 t* + v_t), which described the general incidence risk according to time between january 16 and 30, 2020. through the analysis, b_0 was estimated to be 0.4604, that is, the disease risk on a given day was approximately exp(0.4604) = 1.585 times that on the previous day. the relative risk according to time increased steadily from january 20 onwards and the upward trend continued as of january 30 [figure 2c], indicating that the number of cases nationwide is on the rise. as shown in figure 2d, heilongjiang, hebei, beijing, tianjin, xinjiang, ningxia, jiangsu, hunan, taiwan, and hainan showed a faster increase in the number of cases than was observed overall in the country. the increase in the number of cases in jilin, liaoning, shaanxi, guangxi, and fujian provinces also occurred relatively fast [supplementary table 1, http://links.lww.com/cm9/a210]. the increase in other provinces was consistent with or lower than the overall national trend [figure 2d]. since hubei province had the highest number of cases, we analyzed the temporal and spatial distribution in different cities of hubei province. wuhan had the highest number of cases, followed by huanggang and xiaogan cities. suizhou, jingmen, and xianning were part of the second group with a high number of cases. the spatial convergence analysis had 100,000 iterations [figure 3a]. hot spots were identified in the east regions and cold spots were identified in the west regions [figure 3b]. the overall temporal trend in the change in the number of cases was calculated using the model. the average time trend coefficient b_0 was estimated to be 0.6727, indicating that the time risk (occurrence probability in time) on a given day was exp(0.6727) = 1.960 times that on the previous day, suggesting that the daily number of cases in hubei province is on the rise [figure 3c].
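the day-over-day multipliers quoted above follow from exponentiating the trend coefficient. a minimal check, assuming the random time effect is ignored so that only the linear term contributes (the function names are illustrative):

```python
import math


def daily_risk_ratio(b0):
    """Ratio of risk on one day to risk on the previous day under a log-linear trend."""
    return math.exp(b0)


def risk_ratio_after(b0, days):
    """Cumulative ratio after a number of days, assuming the trend stays constant."""
    return math.exp(b0 * days)


national = daily_risk_ratio(0.4604)  # coefficient reported for the whole country
hubei = daily_risk_ratio(0.6727)     # coefficient reported for hubei province
print(f"national: {national:.3f}, hubei: {hubei:.3f}")
```

the compounding helper is only a rough extrapolation; the fitted model also includes a time-structured random effect that this sketch leaves out.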
xiangyang, suizhou, yichang, and ezhou showed the highest increase rates, and shiyan, shennongjia, xiaogan, and huangshi showed relatively high increase rates [ figure 3d ]. other cities had a growth slower than the overall growth in the province [supplementary table 2 , http://links.lww.com/cm9/ a210]. the increase rate in hubei province (1.960) was higher than that in the whole country (1.585), indicating that the rate of increase in hubei province was significantly higher than that in other provinces in china. the outbreak started from wuhan, and nearly all early cases were derived from this city, which is located in hubei province. because the outbreak occurred just before the spring festival, large-scale population migration during this period influenced the subsequent epidemic. from january 1 to 23, 2020, the population that migrated out of wuhan city and hubei province increased steadily, peaking on january 21 and 22 [ figure 4a ]. wuhan city was under lockdown on january 23, and after that, population migration was greatly inhibited. as observed in 2019, high population migration occurred on january 31; the timely city lockdown prevented a subsequent outbreak burst. we analyzed the migration into and out of wuhan city and hubei province. the top targets for emigration included henan and hunan provinces [ figure 4b ]. more people migrated out of wuhan than into the city [ figure 4c ]. to analyze the correlation between the number of cases and the emigration in wuhan city and hubei province, population migration data were collected from baidu qianxi. the correlation coefficient between the provincial number of cases and emigration from hubei province was 0.719 [ figure 4d ]. 
the correlation coefficient between the provincial number of cases and emigration from wuhan increased to 0.943, with the highest coefficient of 0.996 observed between wuhan and other cities of hubei provinces [ figure 4e and 4f; supplementary tables 3 and 4 , http://links.lww.com/ cm9/a210]. these data strongly indicated that the number of cases was highly related to population emigration from wuhan. although we do not know the exact number of people emigrating from wuhan, 5 million is an astonishing number, considering that each individual may be a potential virus carrier. if no control measures were implemented, the number of cases would exponentially increase. of the 5 million emigrants, 74.22% emigrated to other cities of hubei province [supplementary because the outbreak duration overlaps with the spring festival transport waves, large-scale migration will be a strong determinant of the characteristics of this outbreak. we analyzed the migration in the 3 days before the spring cities with high immigration were relatively scattered. chongqing experienced the highest immigration, accounting for 1.50% of the total number of immigrants [ figure 4 ]. as immigrants will be traveling back to work after the spring festival, the cities showing high "emigration" may be at a high risk of another wave of new cases owing to the return of the migrants. covid-19 is causing great public health and economic losses in china. the number of cases has increased rapidly, with over 70% coming from hubei province. [16, 19] as of january 30, the number of cases has exceeded the total number of cases of the sars-cov outbreak. [20] until february 15, 2020, the cumulative number of confirmed cases was 70,533, nearly ten times that noted during the sars outbreak. prevention and control of the outbreak has required concerted action from the whole population of china. 
although all individuals have participated in the campaign against the outbreak, people in areas with a low number of cases assumed that they were safe from the disease. therefore, awareness of high-risk regions is important for preparing individuals, particularly in regions with low incidence. further, it must be noted that 5 million persons emigrated from wuhan to all over the country. [21] we do not know exactly how many of them are virus carriers, and it is impossible to track and diagnose them all. evidence from previous cases showed that asymptomatic patients in the incubation period are also infectious, making it a greater challenge to track virus carriers. therefore, isolation at home and less contact with others is the most efficient measure to prevent infection and transmission. to reduce transmission, the spring festival holiday has been extended from january 31 to february 2. the opening time for all schools and universities has been delayed, and online teaching programs have been launched. factories have been required to delay resumption or allow work from home. we analyzed the temporal and spatial distribution of reported cases. in general, the number of cases is still on the rise. for hubei province, which has the highest number of cases and deaths, the growth trend is relatively stable. conversely, in other hot spots, the number of cases was not very high, but the growth continued. hence, these areas should be closely monitored. [22] it is particularly noteworthy that the cities with the fastest change in temporal risk, such as chongqing, have large population movements and rapid temporal risk. if they are not strictly monitored, there may be more outbreaks. to prevent disease outbreaks caused by the return travel wave after the spring festival, the country has extended the spring festival holiday. 
correlation analysis showed that early incidence was closely related to the emigration waves from wuhan, that is, the higher the migrating population index, the larger was the number of cases. this also indicated that the first generation of cases in each province mainly came from wuhan. however, with the progress of the epidemic, migrants are spreading the virus to other people and are becoming an important source of local community transmission. therefore, it is necessary to strictly implement isolation and related control measures in accordance with the guidelines. particularly, control measures must be taken to prevent the spread of diseases in communities, which is crucial to prevent a large-scale outbreak. very soon, many company staff will return to their workplaces. because many enterprises in china are labor intensive, with large workforces, human-to-human transmission can occur very easily. therefore, workers need to meet requirements for isolation after returning to the city and use personal protection at work to prevent clustered outbreaks. at present, there have been several reports of employee infections caused by resumption of work; these represent a warning for all enterprises. super megacities such as guangzhou, shenzhen, and shanghai, which have the largest number of migrant workers, need to be prepared for this. from february 16, the number of new cases began to decrease, but the epidemic did not stop completely. therefore, we must act together to stop the spread of the disease. at present, the state has adopted mobility control measures to encourage people to avoid going to public places and wear masks when going out to reduce the risk of human-to-human transmission. we believe that with the joint efforts made by everyone, the number of cases and losses will be kept to a minimum. [chinese medical journal 2020;vol(no) www.cmj.org]
the challenge of emerging and re-emerging infectious diseases risks to healthcare workers with emerging diseases: lessons from mers-cov, ebola, sars, and avian flu bats are natural reservoirs of sars-like coronaviruses origin of viruses: primordial replicators recruiting capsids from hosts first case of 2019 novel coronavirus in the united states a novel coronavirus from patients with pneumonia in china novel wuhan (2019-ncov) coronavirus drug treatment options for the 2019-new coronavirus (2019-ncov) emerging understandings of 2019-ncov early transmission dynamics in wuhan, china, of novel coronavirusinfected pneumonia space-time mixture modelling of public health data temporal and spatial analysis of neural tube defects and detection of geographical factors in shanxi province joint prior distributions for variance parameters in bayesian analysis of normal hierarchical models bayesian image restoration, with two applications in spatial statistics interpreting posterior relative risk estimates in disease-mapping studies the extent of transmission of novel coronavirus in wuhan, china, 2020 updated understanding of the outbreak of 2019 novel coronavirus (2019-ncov) in wuhan preparedness and proactive infection control measures against the emerging wuhan coronavirus pneumonia in china real time data report of epidemic situation severe acute respiratory syndrome-associated coronavirus infection preliminary estimation of the basic reproduction number of novel coronavirus (2019-ncov) in china, from 2019 to 2020: a data-driven analysis in the early phase of the outbreak distribution of the covid-19 epidemic and correlation with population emigration from wuhan, china the authors thank andre kiesel for critical revision of this manuscript. none. 
key: cord-296669-1md8j11e authors: li, xin; lu, peixin; hu, lianting; huang, tianhui; lu, long title: factors associated with mental health results among workers with income losses exposed to covid-19 in china date: 2020-08-04 journal: int j environ res public health doi: 10.3390/ijerph17155627 sha: doc_id: 296669 cord_uid: 1md8j11e the outbreak and worldwide spread of covid-19 has resulted in a high prevalence of mental health problems in china and other countries. this was a cross-sectional study conducted using an online survey and face-to-face interviews to assess mental health problems and the associated factors among chinese citizens with income losses exposed to covid-19. the degrees of the depression, anxiety, insomnia, and distress symptoms of our participants were assessed using the chinese versions of the patient health questionnaire-9 (phq-9), the generalized anxiety disorder-7 (gad-7), the insomnia severity index-7 (isi-7), and the revised 7-item impact of event scale (ies-7) scales, respectively, which found that the prevalence rates of depression, anxiety, insomnia, and distress caused by covid-19 were 45.5%, 49.5%, 30.9%, and 68.1%, respectively. multivariable logistic regression analysis was performed to identify factors associated with mental health outcomes among workers with income losses during covid-19. participants working in hubei province with heavy income losses, especially pregnant women, were found to have a high risk of developing unfavorable mental health symptoms and may need psychological support or interventions. at the end of december 2019, the chinese city of wuhan reported a novel pneumonia caused by coronavirus disease 2019 (covid19) , an infectious disease caused by an acute severe respiratory syndrome coronavirus, which is rapidly spreading both domestically and internationally [1, 2] . 
on 30 january 2020, the world health organization (who) held an emergency meeting and declared the worldwide covid-19 outbreak a public health emergency of international concern [3] . the emergence and rapid increase in the number of covid-19 cases has posed and continues to pose complex challenges for global research, public health, and medical communities [4, 5] . as of 1 june 2020, there were more than 6.15 million confirmed cases of covid-19 across more than 215 countries and regions, including more than 372,130 deaths. with the rapid spread of covid-19, the local government in wuhan immediately adopted a city closure policy, encouraging citizens to work at home and teach online, and shut down non-essential services to mitigate the impact and risks of the disease. then, the governments of other provinces with low numbers of infected people in china and many other countries around the world entered states of emergency for the health response and issued a series of policies, including ordering citizens (regardless of having symptoms of infection or not) to self-isolate at home, and maintaining social distance from other people. however, concerns have arisen about the potential psychological impact of these measures [6] [7] [8] . studies proved that covid-19 has caused a high prevalence of mental health problems in china [8] [9] [10] [11] [12] and other countries around the world [13] [14] [15] [16] . some researchers have attempted to understand the outbreak of this novel coronavirus from a global health perspective [17] [18] [19] . however, most studies focused on the psychological effects of people who were infected with covid-19, medical workers, or people in specific regions [10] [11] [12] [13] [14] [15] 20] . studies showed that the economic impact caused by severe acute respiratory syndrome (sars) will produce psychological morbidities in individuals who are directly or indirectly exposed to life-threatening situations [21] . 
the occurrence of such psychological morbidities among workers can impact their daily functions and lead to immediate economic and physiological consequences, such as lost job productivity, depression, and anxiety [22, 23] . to the best of our knowledge, no previous study focused on mental health problems among people with income losses caused by covid-19. to address this gap, the aim of our study was to evaluate the mental health of chinese workers with income losses exposed to covid-19 by quantifying the degrees of depression, anxiety, insomnia, and distress, and analyze the potential risk factors related to these symptoms. in this study, besides age, sex and other demographic characteristics, participants from hubei province and outside hubei province were taken as the research objects for comparison of regional differences. the ultimate goal of this study was to assess the mental health burden of people with income losses during covid-19 and to provide guidance for the promotion of mental well-being among this population. this was a cross-sectional study conducted using an online survey and face-to-face interviews to assess mental health problems and their associations with income losses among chinese citizens who were exposed to coronavirus disease 2019 (covid-19) from 25 april to 9 may 2020. eligibility criteria included (i) currently living in china, (ii) aged 18 years or older, and (iii) with income losses caused by covid-19. participants were encouraged to participate in online surveys or complete offline questionnaires. a total of 421 of 600 contacted individuals completed the survey for a participation rate of 70.2%, and 23 people with no loss of income were excluded from the study. the final sample included 398 respondents, with a response rate of 66.3%. 
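the sample-size arithmetic above can be checked directly. the variable names below are illustrative; the counts are the ones reported in the text.

```python
def rate_percent(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


contacted = 600
completed = 421
excluded_no_income_loss = 23

final_sample = completed - excluded_no_income_loss
participation_rate = rate_percent(completed, contacted)
response_rate = rate_percent(final_sample, contacted)
print(final_sample, participation_rate, response_rate)
```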
this study was approved by the ethics committee and institutional review board of wuhan university, wuhan, china (ref: 20200411), and conducted in accordance with the ethical guidelines of the declaration of helsinki of the world medical association. all data were deidentified before being provided to the investigators. consent from each participant was obtained at the beginning of the survey. the questionnaire consisted of 37 factors to record demographic indicators and symptoms of depression, anxiety, insomnia, and distress caused by covid-19 of the participants (see appendix a). the following demographic data were included in this study: sex (male or female), age (18-25, 26-30, 31-40 and >40 years old categories), educational level (0% to 25%, 25-50%, and >50% less than pre-epidemic income, respectively), and place of residence (urban or rural). mental disorders, including depression, anxiety, insomnia, and distress, caused by covid-19 were assessed in our study by chinese versions of validated measurement tools [24] [25] [26] [27] : the patient health questionnaire-9 (phq-9; the total score ranged from 0 to 27) [24] , the generalized anxiety disorder-7 (gad-7; the total score ranged from 0 to 21) [25] , the insomnia severity index-7 (isi-7; the total score ranged from 0 to 28) [26] , and the revised 7-item impact of event scale (ies-7; the total score ranged from 0 to 28) [27] . the response options are: 3 = nearly every day, 2 = more than half the days, 1 = several days, and 0 = not at all for phq-9 and gad-7; 4 = always, 3 = often, 2 = sometimes, 1 = rare, and 0 = never for isi-7 and ies-7. 
the total scores of these survey scales are interpreted as follows: phq-9, extremely severe (22-28), severe (15-21), moderate (10-14), mild (5-9), and normal (0-4) depression; gad-7, severe (15-21), moderate (10-14), mild (5-9), and normal (0-4) anxiety; isi-7, severe (22-28), moderate (15-21), subthreshold (8-14), and normal (0-7) insomnia; and ies-7, severe (22-28), moderate (15-21), subthreshold (8-14), and normal (0-7) distress. the cutoff scores for detecting possible major symptoms of depression, anxiety, insomnia, and distress caused by covid-19 are 10, 10, 15, and 15, respectively. a higher score indicates more severe self-reported symptoms [24] [25] [26] [27] . the psychometric properties and internal reliabilities of the 4 scales have been previously confirmed in chinese populations [24] [25] [26] [27] . in [24] , statistical tests were performed to determine the reliability and validity of phq-9. results showed that the internal consistency value of phq-9 was 0.854 and the test-retest reliability value of phq-9 was 0.873, indicating that the phq-9 is a valid and reliable tool to evaluate depression in chinese people. he [25] tested the reliability and validity of the chinese version of gad-7. the results show that the cronbach's α coefficient of gad-7 is 0.898, and the test-retest reliability coefficient is 0.856, indicating that the chinese version of gad-7 has good reliability and validity for evaluating anxiety. doris s.f. yu [26] tested the reliability and validity of the chinese version of isi-7, finding that cronbach's alpha of the chinese version of the isi-7 was 0.81, with item-to-total correlations in the range of 0.34-0.67.
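the score interpretation given at the start of this passage can be sketched as a lookup, reading the bands as 0-4, 5-9, 10-14, 15-21, and 22-28 for phq-9 and correspondingly for the other scales, with the major-symptom cutoffs of 10, 10, 15, and 15. the dictionary layout and function names are assumptions made for illustration only.

```python
from bisect import bisect_right

# Lower bound of each band and its label, as quoted in the text.
BANDS = {
    "phq-9": ([0, 5, 10, 15, 22], ["normal", "mild", "moderate", "severe", "extremely severe"]),
    "gad-7": ([0, 5, 10, 15], ["normal", "mild", "moderate", "severe"]),
    "isi-7": ([0, 8, 15, 22], ["normal", "subthreshold", "moderate", "severe"]),
    "ies-7": ([0, 8, 15, 22], ["normal", "subthreshold", "moderate", "severe"]),
}

# Cutoffs for possible major symptoms, as quoted in the text.
MAJOR_SYMPTOM_CUTOFF = {"phq-9": 10, "gad-7": 10, "isi-7": 15, "ies-7": 15}


def severity(scale, total_score):
    """Return the severity label for a total score on the given scale."""
    if total_score < 0:
        raise ValueError("total score cannot be negative")
    lower_bounds, labels = BANDS[scale]
    # bisect_right finds how many lower bounds are <= the score.
    return labels[bisect_right(lower_bounds, total_score) - 1]


def has_major_symptoms(scale, total_score):
    """True when the total score reaches the major-symptom cutoff."""
    return total_score >= MAJOR_SYMPTOM_CUTOFF[scale]


print(severity("phq-9", 12), has_major_symptoms("phq-9", 12))
```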
in [27] , chan reported that the cronbach 'α coefficient of ies-r is 0.89, which proved the ies-r is a valid and reliable tool to evaluate distress among chinese people. in our study, the cronbach's alpha coefficient of our questionnaire is 0.97. the cronbach's alpha coefficients of the chinese versions of phq-9, gad-7, isi-7 and ies-7 were 0.920, 0.945, 0.879 and 0.909, respectively. first, we used descriptive statistics to describe the socio-demographic characteristics of these participants. second, the prevalence rates of depression (phq-9 score ≥ 5), anxiety (gad-7 score ≥ 5), insomnia (isi-7 score ≥ 8), and distress (ies-7 score ≥ 8) were estimated. finally, multivariable logistic regression models were used to explore factors associated with depression, anxiety, insomnia, and distress among workers with income losses exposed to covid-19 in china, and the associations between risk factors and outcomes are presented as adjusted odds ratios (aors) with a 95% confidence interval (ci), after adjustment for confounders, including sex, age, marital status, educational level, working position, place of residence, degrees of income losses. data analysis was performed by spss statistical software (version 25.0, ibm corp., armonk, ny, usa,), with p-values < 0.05 indicating statistical significance. the significance level was set at α = 0.05, and all tests were two-tailed. as shown in table 1 , the proportion of men to women was close, at 50.5% and 49.5%, respectively, and the proportion of marital status (recoded into married and other including unmarried, widowed, and divorced) was similar to that of sex, at 49.5% and 50.5%, respectively. we classified their income losses caused by covid-19 as one of the demographic variables. response options were slightly affected (>0% to 25%), moderately affected (25-50%), and heavily affected (>50%). 
table 1 shows that the proportions of light, middle, and heavy income loss (>0% to 25%, 25-50%, and >50% lower income than pre-epidemic income, respectively) caused by covid-19 were 33.9%, 17.6%, and 48.5%, respectively. as hubei was the province most severely affected by covid-19 in china, all 398 participants were grouped by their geographic location. the proportions in hubei province and in places outside hubei province were 44.2% and 55.8%, respectively. most of these participants were aged from 26 to 40 years, lived in urban areas, and had a college degree or above. generally consistent with the existing covid-19 research results [8] [9] [10] , the prevalence rates of symptoms of depression, anxiety, insomnia, and distress caused by covid-19 among our participants were 45.5%, 49.5%, 30.9%, and 68.1%, respectively. as shown in table 2 , multivariable logistic regression analyses showed that, after controlling for covariates, the adjusted odds of depression, anxiety, insomnia, and distress were lower among participants aged 30 years or younger (e.g., depression among participants aged 26-30 years: or = 0.228, 95% ci: 0.097-0.535, p < 0.001; depression among participants aged 18-25 years: or = 0.187, 95% ci: 0.072-0.489, p < 0.001) compared with those aged over 40 years, and greater among those working in hubei province (e.g., depression: or = 2.647, 95% ci: 1.662-4.217, p < 0.001) than among those working outside hubei province. participants whose income was heavily affected by covid-19 were more prone to symptoms of depression, anxiety, and insomnia than those with lighter losses (e.g., depression among participants with light income losses: or = 0.215, 95% ci: 0.124-0.371, p < 0.001). those from urban areas had lower adjusted odds of depression, anxiety, insomnia, and distress than those from rural areas (e.g., depression: or = 0.391, 95% ci: 0.226-0.675, p = 0.001).
at the same time, being married (or, 3.348; 95% ci, 1.896-5.911; p < 0.001) was associated with a greater risk of feeling depressed than being unmarried. alongside the question on sex, we included an additional question (if you are a woman, please indicate whether you are pregnant). as shown in table 3 , multivariable logistic regression analyses showed that, after controlling for covariates, pregnant women with income losses during covid-19 had a greater risk of depression and anxiety (depression: or = 2.956, 95% ci: 1.208-7.229, p = 0.018; anxiety: or = 3.146, 95% ci: 1.217-6.133, p = 0.018) than non-pregnant women. table 2 lists the detailed results of phq-9 from multivariable logistic regression analysis; the results for the other scales are presented in supplementary materials (tables s1-s3). abbreviations: na = not available; aor: adjusted odds ratio; ci: confidence interval. phq-9: the patient health questionnaire-9. according to lai, j et al. [10] , the cutoff scores for detecting possible major symptoms of depression, anxiety, insomnia, and distress caused by covid-19 are 10, 10, 15, and 15, respectively. thus, the prevalence rates of our participants who had severe mental symptoms of depression, anxiety, insomnia, and distress were 19.1%, 21.9%, 7.8%, and 25.9%, respectively.
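the odds ratios and confidence intervals reported above are linked to the logistic regression coefficients by exponentiation. the sketch below assumes wald-type intervals, which are symmetric on the log-odds scale; under that assumption the coefficient and its standard error can be recovered from a published interval. the example uses the interval reported for depression among those working in hubei province; the function names are illustrative.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_with_ci(beta, se):
    """Odds ratio and 95% Wald interval from a log-odds coefficient and its standard error."""
    return (
        math.exp(beta),
        math.exp(beta - Z_95 * se),
        math.exp(beta + Z_95 * se),
    )


def beta_and_se_from_ci(lower, upper):
    """Recover the coefficient and standard error implied by a 95% Wald interval."""
    beta = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return beta, se


# Reported interval for depression among participants working in hubei province.
beta, se = beta_and_se_from_ci(1.662, 4.217)
odds_ratio, ci_low, ci_high = odds_ratio_with_ci(beta, se)
print(f"or = {odds_ratio:.3f}, 95% ci: {ci_low:.3f}-{ci_high:.3f}")
```

the recovered point estimate is the geometric mean of the interval bounds, which can be compared with the odds ratio printed in the table.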
similar to the findings regarding the prevalence of mental symptoms, as shown in table 4 , multivariable logistic regression analyses showed that, after controlling for covariates, the adjusted odds of severe symptoms of depression, anxiety, and distress were lower among participants aged 26-30 years (e.g., severe depression: or = 0.243, 95% ci: 0.091-0.645, p = 0.005) compared with those aged over 40 years, greater among those with heavy income losses than among those with light and middle income losses (e.g., severe depression among participants with light income losses: or = 0.246, 95% ci: 0.121-0.502, p < 0.001), and lower among those from urban areas than among those from rural areas (e.g., severe depression: or = 0.337, 95% ci: 0.185-0.615, p < 0.001). those working in hubei province were more prone to experiencing severe symptoms of anxiety and distress than those working outside hubei province. we enrolled 398 respondents and found a high prevalence of mental health symptoms among workers with income losses caused by covid-19 in china. the latest national sample indicated that the prevalence rates of any disorder (excluding dementia), anxiety disorders, and depressive disorders were 16.6%, 7.6%, and 6.9% in china, respectively. compared with national data, we found much higher prevalence rates of participants with symptoms of depression, anxiety, insomnia, and distress caused by covid-19, at 45.5%, 49.5%, 30.9%, and 68.1%, respectively. our findings are consistent with those of previous covid-19 studies, including a study in mainland china that found that the prevalence of depression as measured during the covid-19 pandemic was 48.3% [8] and a study in hong kong that found that the prevalence of depression caused by covid-19 was 49.8% [9] . mental disorders, including depression, anxiety, insomnia, and distress, caused by covid-19 were assessed in our study by chinese versions of validated measurement tools [24] [25] [26] [27] : phq-9, gad-7, isi-7, and ies-7.
in our study, the cronbach's alpha coefficient of our questionnaire is 0.97. the cronbach's alpha coefficients of the chinese versions of phq-9, gad-7, isi-7 and ies-7 were 0.920, 0.945, 0.879 and 0.909, respectively, proving these scales have good reliabilities and validities in the application of evaluating mental disorders among chinese worker with income losses. by reviewing the literature, we found that these chinese scales are widely used in the study of psychological problems. especially recently, these four scales have been used to study covid-19. for example, researchers used them to assess the magnitude of mental health outcomes among healthcare workers treating patients exposed to covid-19 in china [10] , phq-9 and gad-7 were used to evaluate depression and anxiety in hong kong during the covid-19 pandemic [9] , and gad-7 was used to assess the prevalence of mental health problems and examine their association with social media exposure [8] . in this study, besides age, sex and other demographic characteristics, participants from hubei province and outside hubei province were taken as the research objects for comparison of regional differences. the proportions of respondents from hubei province and places outside hubei province were 44.2% and 55.8%, respectively. the proportions of light, middle, and heavy losses of income (>0 to 25%, 25-50%, and >50% less income than pre-epidemic levels, respectively) caused by covid-19 were 33.9%, 17.6%, and 48.5%, respectively. most of these participants were aged from 26 to 40 years, lived in urban areas, and had a college degree or above. we found that workers with heavy income losses caused by covid-19 reported more symptoms of depression, anxiety, and insomnia. compared with participants outside hubei province, those in hubei province reported higher scores on all four scales. 
the prevalence rates of our participants who had severe mental symptoms of depression, anxiety, insomnia, and distress were 19.1%, 21.9%, 7.8%, and 25.9%, respectively. our findings further indicated that pregnant women scored higher than non-pregnant women on phq-9 and gad-7, which measure symptoms of depression and anxiety. these findings are consistent with previous studies' findings that exposure to a public health emergency can cause mental health problems. this study has several limitations. first, it was limited in scope. almost half of the participants (44.2%) were from hubei province, limiting the generalization of our findings to less affected regions. this survey was mainly conducted online, so some respondent bias, such as low participation by older citizens, may have affected the results. second, the survey was conducted over two weeks and lacked longitudinal follow-up. it was hard to determine whether the mental health symptoms of workers with income losses could become more severe, so the long-term psychological implications for this population are worth further investigation. last, although the participation rate of this study was 70.2% (421 of 600 contacted individuals), response bias may still exist if the non-respondents were either too stressed to respond or not at all stressed and therefore not interested in this survey. in conclusion, our findings showed relatively high prevalence rates of symptoms of depression, anxiety, insomnia, and distress among workers with income losses during covid-19. the prevalence of mental health problems among workers caused by covid-19 in china is high, especially among those working in hubei province with heavy income losses. in addition, pregnant women with income losses had a greater risk of depression and anxiety than other women, and may need psychological support or interventions. these results further indicate that the long-term psychological implications for this population are worth further investigation.
supplementary materials: the following are available online at http://www.mdpi.com/1660-4601/17/15/5627/s1, table s1: prevalence of anxiety and associated factors, table s2: prevalence of insomnia and associated factors, table s3: prevalence of distress and associated factors, table s4: prevalence of severe anxiety and associated factors, table s5: prevalence of severe insomnia and associated factors, table s6: prevalence of severe distress and associated factors. the authors declare no conflicts of interest. the questionnaire consisted of 37 questions to record demographic indicators and symptoms of depression, anxiety, insomnia, and distress of all participants. demographic data: the following demographic data were included in this study: sex (male or female), age (18-25, 26-30, 31-40, or >40 years categories), educational level, loss of income (light, middle, or heavy: >0 to 25%, 25-50%, and >50% less income than the pre-epidemic level, respectively), and place of residence (urban or rural). the english versions of the phq-9, gad-7, isi-7, and ies-r-7 scales were used in this study to measure the degree of symptoms of depression, anxiety, insomnia, and distress of our participants. early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia a novel coronavirus from patients with pneumonia in china emergency committee regarding the outbreak of novel coronavirus (2019-ncov) outbreak of pneumonia of unknown etiology in wuhan, china: the mystery and the miracle characteristics of and important lessons from the coronavirus disease 2019 (covid-19) outbreak in china: summary of a report of 72314 cases from the chinese center for disease control and prevention risk perception and impact of severe acute respiratory syndrome (sars) on work and personal lives of healthcare workers in singapore: what can we learn?
curating evidence on mental health during covid-19: a living systematic review mental health problems and social media exposure during covid-19 outbreak depression and anxiety in hong kong during covid-19 factors associated with mental health outcomes among health care workers exposed to coronavirus disease mental health problems during the covid-19 pandemics and the mitigation effects of exercise: a longitudinal study of college students in china impact of the covid-19 pandemic on mental health and quality of life among local residents in liaoning province, china: a cross-sectional study related health factors of psychological distress during the covid-19 pandemic in spain covid-19 and the fears of italian senior citizens epidemiological aspects and psychological reactions to covid-19 of dental practitioners in the northern italy districts of modena and reggio emilia the psychological impact of confinement linked to the coronavirus epidemic covid-19 in algeria the predictive capacity of air travel patterns during the global spread of the covid-19 pandemic: risk, uncertainty and randomness spatio-temporal patterns of the 2019-ncov epidemic at the county level in hubei province a comparison of infection venues of covid-19 case clusters in northeast china quarantine and isolation: how are quarantine and isolation different? 
available online coronavirus is more dangerous for the global economy than sars the economic impact of pandemic influenza in the united states: priorities for intervention the impact of covid-19 on tourist satisfaction with b&b in zhejiang, china: an importance-performance analysis validity and reliability of patient health questionnaire-9 and patient health questionnaire-2 to screen for depression among college students in china reliability and validity of a generalized anxiety scale in general hospital outpatients insomnia severity index: psychometric properties with chinese community-dwelling older people the development of the chinese version of impact of event scale-revised (cies-r) this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license key: cord-035307-r74ovkbd authors: liu, shuchang; ma, zheng feei; zhang, yutong; zhang, yingfei title: attitudes towards wildlife consumption inside and outside hubei province, china, in relation to the sars and covid-19 outbreaks date: 2020-11-11 journal: hum ecol interdiscip j doi: 10.1007/s10745-020-00199-5 sha: doc_id: 35307 cord_uid: r74ovkbd we designed a self-administered 20-item questionnaire to determine changes in attitudes towards wildlife consumption in chinese adults during the sars epidemic in 2002–2003 and on-going covid-19 pandemic that was first identified in december 2019. a total of 348 adults (177 males and 171 females) with a mean age of 29.4 ± 8.5 years participated, the majority (66.7%) from hubei. the percentages of participants who had eaten wildlife significantly decreased from 27.0% during sars to 17.8% during covid-19 (p = 0.032). the most common reason participants provided for consuming wildlife was to try something novel (64.9% during sars and 54.8% during covid-19). more than half of participants (≥53.5%) reported that they had stopped eating wildlife meat because most species of wildlife are legally protected. 
our study results indicate that over the period between the sars epidemic and the outbreak of the covid-19 pandemic, attitudes towards the consumption of wildlife in china have changed significantly. in november 2002, an epidemic of severe acute respiratory syndrome (sars) centred in foshan municipality, guangdong province, was identified, which peaked in february 2003 (evans et al. 2003). early case reports indicated that patients positive for sars lived near animal markets, and nearly half of them were food practitioners who had contact with animal products. after 17 years, in december 2019, a novel coronavirus pneumonia outbreak was reported in wuhan, hubei province (fig. 1). the coronavirus disease 2019 outbreak was traced to the huanan seafood market, and most of the early diagnosed patients had been to the local fish and wildlife market before the outbreak. the fish and wildlife market also sold live animals such as poultry, bats, marmots, hedgehogs, badgers, birds, and snakes (wu et al. 2020). since both outbreaks have been linked to wildlife markets (li and davey 2013; lu et al. 2020; wu et al. 2020), it is important to explore the changes of attitude towards eating wildlife before and after the two outbreaks in the general population. both the sars and the on-going covid-19 outbreaks have had extremely negative impacts worldwide. the world health organization (who) recorded >8400 cases of sars and 800 deaths worldwide (zhong et al. 2003). the covid-19 outbreak has also led to serious consequences including unprecedented levels of infection and deaths, decreased quality of life, and increased stress due to strict lockdowns and limits on social interactions (ma et al. 2020; zhang and ma 2020a,b). in mainland china, the number of diagnosed patients increased from 330 cases on 21st january 2020 to 17,000 cases on 2nd february 2020, within two weeks (dong et al. 2020).
hubei province, and especially its capital city wuhan, has been significantly affected since wuhan was the epicentre of the covid-19 outbreak. the total number of covid-19 cases in hubei province had reached 67,800 as of march 28th. meanwhile, the total number of covid-19 cases in other chinese provinces had reached 13,600 (maier and brockmann 2020). the two epidemics began in areas with populations with a preference for consumption of wildlife (sun et al. 2020), which has been identified as the source of both the outbreaks, and covid-19 has been reported as having a probable origin in bats (zhou et al. 2020). viruses usually need intermediate hosts to spread from bats to humans (sun et al. 2020), and some wildlife species such as pangolins are reported to act as intermediate hosts of severe acute respiratory syndrome-coronavirus 2 (sars-cov-2). the virus may pass onto humans when they consume wildlife meat, and subsequently may lead to the risk of human-to-human transmission (zhang et al. 2020). however, published research related to the attitudes regarding wildlife consumption during both the ongoing covid-19 pandemic and the sars outbreak of 17 years ago is very limited, illustrating the general lack of sufficient scientific attention to the safety of and attitudes towards consuming wildlife worldwide (wei 2020). therefore, our aim in this study was to determine changes in attitudes towards wildlife consumption in chinese adults in relation to the sars and covid-19 outbreaks with a particular focus on hubei province. this is because hubei province, especially its capital city wuhan, has been significantly hit by the covid-19 pandemic. our findings from this study have important implications for public health, especially relating to the current dietary habit of consuming wildlife meat in china and elsewhere, and provide a basis for future studies to develop more effective prevention and treatment strategies.
we conducted a cross-sectional study between 7 april 2020 and 20 april 2020 by using convenience sampling. inclusion criteria included: non-pregnant individuals of chinese nationality aged ≥18 years and currently living either in or outside hubei province, china, who were living in the same province during both the sars and covid-19 outbreaks. no financial rewards were given to participants for completing the questionnaire. all participants provided informed consent prior to study enrolment. the study obtained approval from the ethics committee of the jinzhou medical university (ref. no. jydll2020002). in addition, our study protocol was conducted according to the provisions of the declaration of helsinki (as revised in edinburgh 2000). the questionnaire comprised a total of 20 related questions including 11 eliciting basic socioeconomic information such as sex, age, education, job type, marital status, religion, and city of residence. we also asked participants if their employment was related to healthcare professions. additionally, participants were asked to indicate whether they or their friends/relatives were currently diagnosed with covid-19. there were five questions each for sars and covid-19. furthermore, participants were asked what they would do if they saw someone hunting illegally. the questionnaire was distributed via wechat, qq, and baidu post bar. in the sars and covid-19 sections of the questionnaire, we asked participants whether they had ever eaten wildlife such as palm civets, snakes, wild boar, frogs, monkeys, bats, or pangolins during the outbreaks. if they answered yes, they were asked to select their reason for eating wildlife, including "i eat wildlife for nutrients," "i eat wildlife to test something novel," "i eat wildlife because they taste good," or "i eat wildlife because they are expensive, and they signify my social status."
if they answered no, they were asked to select their reasons for not eating wildlife, including "i do not eat wildlife because i dislike eating wildlife," "i do not eat wildlife because they are protected by law," "i do not eat wildlife because they are too expensive," and "i do not eat wildlife because it is hard to buy wildlife in the local markets." we then provided four choices for participants reflecting whether or not their opinion had changed about eating wildlife since the sars outbreak: "i eat wildlife whenever i get the chance," "i have stopped eating wildlife meats because wildlife are legally protected," "i will only eat wildlife meats after they are inspected by food inspectors," and "i had another reason," which they were asked to state specifically. we also included questions as to whether participants considered palm civets to be carriers of sars, and bats to be carriers of sars-cov-2. statistical analyses were performed using spss ver. 25 (spss, chicago, il). differences were considered statistically significant when the p value was <0.05. differences in age between the sexes were assessed using an independent t-test. relationships between two categorical variables were analysed with a chi-square test. all results of quantitative variables were presented either as frequency (percentage) or mean ± standard deviation where appropriate. the online questionnaire was completed by 348 chinese adults and of these, 66.7% (231/348) were from hubei province and 35.3% (123/348) were from wuhan city (table 1). the mean age of participants was 29.4 ± 8.5 years, with no difference in mean age between men and women (p = 0.873), and 95.7% of participants were under 50 years old. the majority of participants (81.9%) had a higher education qualification level. about one third of participants (37.1%) were married.
none of the participants in the study was currently diagnosed with covid-19; only two participants indicated that they had friends who had been diagnosed with covid-19; 92.0% of participants declared they had no religious belief, and 6.3% indicated they were buddhist. in addition, 97.1% were of han ethnicity, while man and hui accounted for 1.1% and 1.1%, respectively. the percentages of participants who had ever eaten wildlife were much lower than those who had not eaten, both during the sars (27.0% vs. 73.0%) and covid-19 (17.8% vs. 82.2%) outbreaks (tables 2 and 3). however, the percentages of participants who consumed wildlife differed significantly between the two outbreaks (p = 0.032), as 27.0% of participants reported that they consumed wildlife before sars and only 17.8% had eaten wildlife before covid-19. for those who had eaten wildlife, the most common reason was to test something novel, 64.9% during the sars and 54.8% during the covid-19 outbreaks, respectively. interestingly, no one reported consuming wildlife because its expense signified their social status. for those who had never eaten wildlife, the two most common reasons were dislike of eating wildlife (47.7% during sars and 39.9% during covid-19) and because most species of wildlife are protected by law (43.3% during sars and 52.5% during covid-19). education level was significantly associated with wildlife consumption, both during the sars and covid-19 outbreaks (p = 0.002 and p < 0.001, respectively). additionally, only during the sars outbreak, there were significant differences in the percentages of wildlife consumption between males and females (10.5% and 42.9%, respectively) (p < 0.001). however, there was no difference in the percentage of participants living inside or outside hubei who consumed wildlife during the two outbreaks (p = 0.669 and p = 0.620, respectively).
overall, the majority of participants reported that during the covid-19 outbreak they stopped eating wildlife and/or did not eat it because they were legally protected species (67.0%), followed by "only eat inspected wildlife meat" (24.4%) and "eat when got opportunity" (5.2%) (tables 2 and 4). similarly, the majority of participants reported that during the sars outbreak, they stopped eating wildlife and/or did not eat it because they were legally protected species (53.5%), followed by "only eat inspected wildlife meat" (40.0%) and "eat when got opportunity" (2.9%) (tables 3 and 5). those who chose "other reasons" indicated that their attitudes towards not eating wildlife had never changed (3.7% during covid-19 and 3.5% during sars). there were significant differences in the perceptions of eating wildlife during sars and covid-19 between participants living in hubei and those living outside hubei (p = 0.007 and < 0.001, respectively) (tables 4 and 5). participants living in hubei indicated that they changed their opinion during sars mainly because wildlife were legally protected (59.1%), followed by "only eat inspected wildlife meat" (35.8%), and "eat when got opportunity" (3.4%). however, when it came to the covid-19 outbreak, the percentages changed to 64.7%, 26.7%, and 6.9%, respectively. during sars, participants outside hubei changed their opinion to "only eat inspected wildlife meat" (48.3%), followed by stopping eating wildlife because they were legally protected (42.2%), and "eat when got opportunity" (1.7%). these percentages changed to 19.8% ("only eat inspected wildlife meat"), 71.6% ("stop eating wildlife were legally protected"), and 1.7% ("eat when got opportunity"), respectively during covid-19. there were significant differences in the perceptions of eating wildlife between participants who had higher educational qualifications and participants with secondary education level during sars and covid-19.
the percentages choosing "stop eating wildlife were legally protected", "only eat inspected wildlife meat" and "eat when got opportunity" were 47.0%, 46.3%, and 2.8% respectively for participants who had higher educational qualifications compared to 82.5%, 11.1%, and 3.2% for participants with secondary education level, respectively, during sars. on the other hand, the percentages of those choosing "stop eating because wildlife were legally protected," "only eat inspected wildlife meat," and "eat when got opportunity" were 72.6%, 18.9% and 4.9% for participants who had higher education compared to 41.3%, 49.2% and 6.3% for participants with secondary education, respectively during covid-19 (all p < 0.001). there were no differences in the percentages of those changing their opinion about eating wildlife between males and females or between age groups during covid-19 and sars (all p > 0.05). more than half of the participants (53.7%) thought that palm civets were carriers of sars, while only 14.7% indicated they did not think that palm civets were carriers of sars, and about one-third (32.2%) indicated they did not know. in addition, nearly half the participants (42.2%) agreed that bats were carriers of sars-cov-2. furthermore, significantly more female participants agreed that bats were carriers of sars-cov-2 than male participants (55.0% vs. 29.9%) (p < 0.001). our study results clearly indicate that chinese attitudes towards eating wildlife have changed significantly between the 2002-2003 sars outbreak and the ongoing covid-19 outbreak first identified in december 2019. the percentages of participants who had eaten wildlife decreased from 27.0% during sars to 17.8% during covid-19 (p = 0.032). this showed that the chinese population's attitudes towards eating wildlife have significantly altered over the past 17 years, which may be because the sars outbreak encouraged greater vigilance and reflection on the dangers inherent in wildlife meat consumption.
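the comparisons between categorical groups reported above rely on a chi-square test. as a minimal sketch, a pearson chi-square test (without continuity correction) for a 2 x 2 table can be written with the python standard library alone. the counts below are an assumption: they are reconstructed from the reported percentages (55.0% of 171 females and 29.9% of 177 males agreeing that bats were carriers of sars-cov-2) and are not taken from the study's tables. the function name is hypothetical, and spss may report a slightly different statistic depending on the options used.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2 x 2 table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom and no
    continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a non-zero total")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Assumed counts reconstructed from reported percentages (not original data):
# females: 94 agree, 77 do not; males: 53 agree, 124 do not.
chi2, p = chi_square_2x2(94, 77, 53, 124)
print(f"chi-square = {chi2:.2f}, p = {p:.2g}")
```

under these assumed counts the p value falls well below 0.001, which agrees with the significance level reported for this comparison.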
in addition, there are currently many non-governmental organizations organizing activities to further protect wildlife (yuan et al. 2020). there were significant differences in opinions about eating wildlife during sars and covid-19 between participants from hubei and participants outside hubei. approximately three-fifths of participants from hubei chose not to eat wildlife because most wildlife species are legally protected. approximately half of participants outside hubei chose only to eat inspected wildlife meat. from the sars outbreak to the covid-19 outbreak, the changes in the opinion of participants from hubei and outside hubei were reflected in the fact that the participants who only consumed wildlife that had been inspected during sars indicated that they stopped eating wildlife during covid-19. only 47.8% of participants from hubei agreed that palm civets were carriers of the sars virus, which was lower than among participants outside hubei (63.8%) (p = 0.005). this may be because the main outbreak area of sars was not in hubei, so that outbreak may not have had such a profound impact on participants from hubei (evans et al. 2003). in addition, our results indicate that education level significantly affected attitudes towards wildlife consumption. interestingly, during sars, 30.5% of participants with higher education qualifications indicated they consumed wildlife, which was more than twice that of participants without higher education (11.1%) (p = 0.002). the percentages of participants with higher education who thought that palm civets were sars carriers were more than twice as high as those without higher education (p < 0.001). however, during covid-19, the percentages of participants with secondary school education who consumed wildlife were three times that of participants with higher education. at the same time, these two groups also reflected significant changes in perceptions of eating wildlife (p < 0.001).
from sars to covid-19, participants with higher education who indicated they chose to "stop eating because wildlife are legally protected" increased from 47.0% to 72.6%. usually, wildlife meat is sold for higher prices because of its scarcity. consumers with higher income and higher education level were reported to have higher consumption rates of wild animals (zhang and yin 2014). additionally, consumers with higher education levels usually have a higher income. therefore, this may explain why there were higher percentages of participants with higher education levels who consumed wild meat than those with secondary education level during sars (30.5% vs. 11.1%). however, during covid-19, there were fewer participants with higher education who consumed wildlife than those with secondary level education. it is possible that participants with higher education levels might have become more aware of the risks associated with wildlife consumption, especially after sars and covid-19. china's per capita consumption of meat quadrupled from 1978 to 2002 (liu and diamond 2005). however, meat production cannot keep up with china's growing appetite for animal products (machovina et al. 2015). eating wildlife may be a way to increase sources of protein (asibey 1974). the consumption of wildlife is not uncommon in many parts of the world, including america, africa, and asia, and in many cases is a very important part of cultural identity (lindsey et al. 2013; volpato et al. 2020). however, the chinese population currently has abundant choices for sources of protein. in our study, more than half of the participants indicated that they ate wildlife meat because they wanted to try something novel, and secondly that they like its taste. only a small number of participants (16.0% during sars and 9.7% during covid-19) thought that wildlife meat has special nutritional value.
this seems to indicate that consuming wildlife meat, rather than being a necessary source of protein for the chinese population, is nowadays simply a matter of personal choice. it is worth noting that during the two outbreaks, none of the participants reported that they consumed wildlife because the expensive price signified their social status. if laws related to the protection of wildlife are tightened and strictly enforced, and the supply of wildlife in markets is cut off, then the cost of eating illegally hunted wildlife will increase. the chinese population will then find it increasingly difficult to find opportunities to consume wildlife. thus, the number of individuals who eat wildlife to satisfy their curiosity would also be greatly reduced. at the same time, the dangers inherent in the consumption of wildlife meat, especially if the source is unknown, should be widely publicized. as has become clear during the ongoing covid-19 outbreak, some species of wildlife carry viruses that can cross barriers between species and mutate to become dangerous and potentially fatal to humans (volpato et al. 2020). also noteworthy is that more than half of our study participants (53.7%) indicated that they thought that palm civets were carriers of sars. however, fewer than half of the participants (42.2%) thought that bats were the carriers of sars-cov-2. this may be because, at the time the questionnaire was circulated, the covid-19 outbreak was so recent. when compared to the sars outbreak of 17 years ago, participants may not have had enough knowledge of and familiarity with covid-19. however, since the habit of consuming wildlife is acquired over a long period, a gradual approach to improving eating habits should be adopted, since it is neither feasible to force the chinese population to change their dietary habits just after the pandemic outbreak, nor would it likely produce the desired outcomes.
in our study, some participants reported that they would continue to consume wildlife meat, which indicates that there is still demand for wildlife meat. after the covid-19 outbreak, the chinese government shut down wet markets (markets for live or freshly slaughtered animals), but this clearly did not eliminate demand, and may in fact lead to the wildlife trade continuing underground (volpato et al. 2020). it is more realistic to provide a greater variety of food choices in the markets. for example, most of our study participants who had eaten wildlife (64.9% during the sars and 54.8% during the ongoing covid-19 outbreaks, respectively) consumed wildlife meat because they felt that the wildlife meat was novel and they had the opportunity to acquire it. if qualified enterprises can breed some of these wild species, with the same safety guarantees as currently domesticated farm animals, this could provide an alternative safe option for those who continue to favour wildlife consumption. this might be easier to achieve, with better results, than attempting to enforce a blanket ban on wildlife consumption. furthermore, while protecting the original environment of endangered wildlife species is important, intensive breeding for reintroduction or even meat production is also a useful strategy (leader-williams et al. 1991). the chinese government has in fact implemented a series of measures, including amending the wildlife protection law and captive breeding of wildlife, to further enhance wildlife protection (wang et al. 2019). it was encouraging that the majority of our study participants (91.1%) indicated that they would stop or try to stop illegal hunting, with more than a quarter saying they would firmly stop illegal hunters, and only 8.9% indicating they would not take any action. the covid-19 outbreak has led to lockdown for months, greatly affecting the lives of the whole nation and the whole world (yuan et al. 2020).
it is hoped that the serious consequences of this covid-19 pandemic will alert the chinese population to the importance of environmental protections. a significant strength of our study is that it is one of the first to investigate the impact of covid-19 on wildlife consumption and compare the results with the earlier sars outbreak. furthermore, since we especially targeted participants from hubei province, and more than half of the hubei participants were living in wuhan, the epicentre of the covid-19 outbreak, we had the opportunity to determine whether there were differences in attitudes towards wildlife consumption between residents inside and outside the epicentre of the covid-19 outbreak. one limitation of our study is potential recall bias, because participants might have had difficulty recalling details from the sars period 17 years ago. another limitation is the use of the convenience sampling method. in addition, the translation of some english words and western understandings such as "wildlife" could be problematic in china because of different historical rationales for eating wildlife in chinese and western conceptions and cultures. therefore, our findings should be interpreted cautiously. in conclusion, in the 17 years from the sars to covid-19 outbreaks, the proportion of chinese adults consuming wildlife has decreased significantly. at present, chinese populations seem to be in favour of stopping wildlife consumption and fighting against illegal hunting. however, it is likely that some people in china will continue to consume wildlife meat for a number of reasons including believed health benefits. funding: this research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. data availability: the datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
wildlife as a source of protein in africa south of the sahara evaluation of lifestyle, attitude and stressful impact amid the covid-19 pandemic among adults in an interactive web-based dashboard to track covid-19 in real time wildlife production systems: economic utilisation of wild ungulates culture, reform politics, and future directions: a review of china's animal protection challenge the bushmeat trade in african savannas: impacts, drivers, and possible solutions china's environment in a globalizing world outbreak of pneumonia of unknown etiology in wuhan, china: the mystery and the miracle increased stressful impact among general population in mainland china amid the covid-19 pandemic: a nationwide cross-sectional study after wuhan city's travel ban lifted biodiversity conservation: the key is reducing meat consumption effective containment explains sub-exponential growth in confirmed cases of recent covid-19 outbreak in mainland china potential factors influencing repeated sars outbreaks in china baby pangolins on my plate: possible lessons to learn from the covid-19 pandemic captive breeding of wildlife resources-china's revised supply-side approach to conservation food safety issues related to wildlife have not been taken seriously from sars to covid-19 a new coronavirus associated with human respiratory disease in china regulating wildlife conservation and food safety to prevent human exposure to novel virus wildlife consumption and conservation awareness in china: a long way to go impact of the covid-19 pandemic on mental health and quality of life among local residents in liaoning province, china: a cross-sectional study psychological responses and lifestyle changes among pregnant women with respect to the early stages of covid-19 pandemic willingness of the general population to accept and pay for covid-19 vaccination during the early stages of covid-19 pandemic: a nationally representative survey in mainland china epidemiology and cause of severe acute 
respiratory syndrome (sars) in guangdong, people's republic of china a pneumonia outbreak associated with a new coronavirus of probable bat origin publisher's note springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations conflict of interest the authors declare that they have no conflict of interest. the present study was approved by jinzhou medical university research ethics committee involving human beings. (ref. no. jydll2020002). in addition, our study protocol was conducted according to the provisions of the declaration of helsinki (as revised in edinburgh 2000). all participants were briefed about the study protocol and informed consent was obtained from them. key: cord-309032-idjdzs97 authors: zhou, feng; you, chong; zhang, xiaoyu; qian, kaihuan; hou, yan; gao, yanhui; zhou, xiao-hua title: epidemiological characteristics and factors associated with critical time intervals of covid-19 in eighteen provinces, china: a retrospective study date: 2020-10-09 journal: int j infect dis doi: 10.1016/j.ijid.2020.09.1487 sha: doc_id: 309032 cord_uid: idjdzs97 background as covid-19 ravages continuously around the world, more information on the epidemiological characteristics and factors associated with time interval between critical events is needed to contain the pandemic and to assess the effectiveness of interventions. methods individual information on confirmed cases from january 21 to march 2 was collected from provincial or municipal health commissions. we identified the difference between imported and local cases in the epidemiological characteristics. two models were established to estimate the factors associated with time interval from symptom onset to hospitalization (toh) and length of hospital stay (los) respectively. results among 7,042 cases, 3392 (48.17%) were local cases and 3304 (46.92%) were imported cases. 
since the first intervention was adopted in hubei on january 23, the daily reported imported cases reached a peak on january 28 and gradually decreased thereafter. imported cases were on average younger (41 vs. 48) and more often male (58.66% vs. 47.53%) than local cases. furthermore, imported cases had more contacts with other confirmed cases (2.80 ± 2.33 vs. 2.17 ± 2.10), mainly among family members (2.26 ± 2.18 vs. 1.57 ± 2.06). the toh and los were 2.67 ± 3.69 and 18.96 ± 7.63 days respectively, and a longer toh was observed in elderly patients living in provincial capital cities with higher migration intensity with hubei. conclusions measures to restrict traffic can effectively reduce imported spread. however, household transmission is still not controlled, particularly the infection of elderly women by imported cases. it is still essential to surveil patients and educate them about early admission or isolation. as of september 20, 2020, more than 30 million confirmed cases of coronavirus disease 2019 (covid-19) and more than 900,000 deaths had been reported worldwide by the world health organization (who) (organization, 2020a). at the same time, china had reported 85,291 laboratory-confirmed cases with 4,634 deaths (china national health commission of the people's republic of, 2020a). although the who and the international community have issued declarations and made many efforts to control this pandemic in time, our knowledge of covid-19 is still very limited, and the number of daily reported cases is still increasing sharply worldwide (organization, 2020b). in the context of the rapid spread of covid-19, a full understanding of the epidemiological characteristics of this infectious disease is crucial for epidemic control and public policy practices. 
several studies conducted in china, italy and the united states have reported some epidemiological characteristics of covid-19 in the initial phase (grasselli et al., 2020, liang et al., 2020, price-haywood et al., 2020, richardson et al., 2020, wu and mcgoogan, 2020). however, there is still a lack of research on the space-time characteristics of imported and local cases respectively, which is of great significance. imported cases play a very important role in the spread of the disease, in particular as an indicator for predicting new clusters of infections. understanding their epidemiological characteristics would help us to assess the possible effect of non-pharmaceutical interventions (npis), such as travel restrictions (desjardins et al., 2020, gilbert et al., 2020). furthermore, because susceptible populations, exposure opportunities and interventions change as the epidemic progresses and across locations, the epidemiological characteristics of the disease should be estimated spatiotemporally in order to better describe the epidemic (zhang j. et al., 2020). for example, the space-time characteristics of covid-19 revealed by previous studies can help prioritize locations and the best time for different npis (desjardins et al., 2020, lai s. et al., 2020, masrur et al., 2020). therefore, exploring the epidemiological characteristics of imported cases from a space-time perspective is critical and provides guidance for countries on interventions taken at different periods and regions, specifically in resource-scarce countries and regions. because covid-19 is highly contagious, early detection, isolation, hospitalization and diagnosis are also important for control and can effectively reduce the risk of disease transmission (bi et al., 2020, rong et al., 2020, thompson, 2020). 
delay in hospitalization or isolation may lead to prolonged periods of infectiousness, and increase the difficulty and burden of infectious disease control. previous studies have described some characteristics of patients with covid-19, including the time interval between key events (liang et al., 2020, tian et al., 2020). in addition, existing literature has also brought to light the reduction in the time interval from symptom onset to hospitalization/isolation after various interventions (zhang j. et al., 2020). however, little is known about individual-level factors associated with delayed hospital admission and length of hospital stay. identifying these factors would not only help us predict the medical burden and reasonably allocate medical resources, but would also inform response efforts across the world. in this study, we described the spatiotemporal distribution of covid-19 in eighteen provinces of china (outside hubei province) and investigated the epidemiological characteristics of imported cases and local cases, from the beginning of this epidemic until it was under good control. we further assessed the critical factors associated with the time interval from symptom onset to hospitalization (toh) and the length of hospital stay (los), including demographic, temporal and spatial characteristics. we conducted a retrospective cohort study of covid-19 confirmed cases, based on the detailed information published by the provincial or municipal health commissions in eighteen provinces of china (outside hubei province) from january 21 to march 2. the details of sampling and data collection are shown in figure 1. data collectors were trained and divided into five groups of two according to provinces to collect timely epidemiological data of confirmed cases. 
linkmed edc was used for data entry; the two collectors in each group entered the same data, and we conducted data verification and consistency tests in real time. specifically, demographic characteristics, epidemiological history and dates of critical events were extracted from the official reports of confirmed case details. (1) demographic information, including age, gender, residence at the time of diagnosis and type of symptoms, was included in our analysis. (2) epidemiological history includes history of travel or residence in other regions and history of contact with confirmed cases. patients were classified as imported or local cases according to whether they had a travel or residence history in other regions within 14 days before diagnosis and likely exposure to pathogens in those regions. similarly, we identified whether patients had been in contact with confirmed cases among family and non-family members. (3) the dates of events include the dates of symptom onset, hospitalization/isolation, cdc diagnosis and recovery/death. hospitalization/isolation is defined as a patient receiving regular hospital treatment (not including small medical institutions such as clinics and community health service centers), or a mandatory isolation measure implemented by the community. in this study, we used the time interval between two events to analyze these data, including the time interval from symptom onset to hospitalization (toh) and the length of hospital stay (los). additionally, we collected information on the intensity of migration from hubei to these 18 provinces in the week before january 23, which was obtained from the baidu migration map (baidu, 2020). migration intensity between provinces and hubei was categorized into four levels: strong connection (≥0.15%), medium connection [0.05%-0.15%), weak connection [0.03%-0.05%) and very weak connection (<0.03%). 
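the four migration-intensity levels described above can be written as a small lookup. this is only an illustrative sketch: the function name `migration_level` and the string labels are hypothetical and do not come from the study, and the input is assumed to be the baidu migration intensity expressed in percent.

```python
def migration_level(intensity_pct: float) -> str:
    """Map a migration intensity (in percent) to one of four connection levels.

    Thresholds follow the text: strong (>= 0.15%), medium [0.05%, 0.15%),
    weak [0.03%, 0.05%), very weak (< 0.03%). The function name and labels
    are illustrative assumptions, not part of the original study.
    """
    if intensity_pct < 0:
        raise ValueError("migration intensity cannot be negative")
    if intensity_pct >= 0.15:
        return "strong"
    if intensity_pct >= 0.05:
        return "medium"
    if intensity_pct >= 0.03:
        return "weak"
    return "very weak"
```

the lower bound of each interval is inclusive and the upper bound exclusive, matching the half-open intervals in the text.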
finally, according to the daily trend of new cases and the dates of intervention, we divided the entire epidemic into five periods from the beginning of the epidemic (jan 21) to mar 2. the first period is before january 23, when wuhan took measures of traffic restrictions and lockdown; after that, every week counts as one period, until the last period, the recession of the epidemic after february 14. we described the epidemic scale in 18 provinces and the proportion of imported cases spatiotemporally. meanwhile, the demographic characteristics of imported and local cases were reported. in addition, two models were established to identify and quantify the sociodemographic factors relevant to toh and los respectively. in the first model, we estimated the factors associated with toh using a generalized linear model with a poisson distribution and a log link. the odds ratios (or) and their 95% confidence intervals (ci) were calculated after incorporating multiple variables (coxe et al., 2009, sas, 2016). in the second model, an accelerated failure time (aft) model was used to handle survival data with both left and right censoring (kalbfleisch, 2002, paul, 2010). in our analysis of factors associated with los, left censoring occurs if we know that a patient recovered before march 2 but the exact time cannot be obtained. similarly, right censoring occurs for patients who were confirmed in the later phase of the epidemic. moreover, we included the toh in the model and used the hazard ratios (hr) and their 95% cis to identify the difference in los among recovered patients with different characteristics. based on the distribution of los, which is denoted by t, we established the weibull model, written as $\log t = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p + \sigma\varepsilon$, where $x_1,\ldots,x_p$ are the covariates, $\varepsilon$ is a random disturbance term, and $\beta_0,\ldots,\beta_p$ and $\sigma$ are parameters to be estimated. then we applied a likelihood function for censored data to estimate the parameter values. inc., north carolina, usa). 
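to make the role of left and right censoring in the weibull aft likelihood concrete, the following is a minimal sketch in plain python. it assumes the standard parametrization in which the log of los equals a linear predictor plus sigma times a standard (minimum) extreme-value error; the function name, the status labels and the per-patient linear predictors `mus` are hypothetical, and this is not the sas procedure used by the authors.

```python
import math

def weibull_aft_loglik(times, statuses, mus, sigma):
    """Censored log-likelihood for log T = mu + sigma * eps (Weibull AFT).

    times    : observed or censoring times (days, > 0)
    statuses : "event" (exact LOS known), "right" (LOS exceeds the time),
               "left" (LOS is below the time, exact value unknown)
    mus      : per-patient linear predictor, e.g. beta_0 + sum_j beta_j * x_j
    sigma    : scale parameter (> 0)
    All names here are illustrative assumptions.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    total = 0.0
    for t, status, mu in zip(times, statuses, mus):
        z = (math.log(t) - mu) / sigma
        if status == "event":
            # log density of T: z - exp(z) - log(sigma) - log(t)
            total += z - math.exp(z) - math.log(sigma) - math.log(t)
        elif status == "right":
            # log survival: log S(t) = -exp(z)
            total += -math.exp(z)
        elif status == "left":
            # log CDF: log(1 - exp(-exp(z))), computed stably via expm1
            total += math.log(-math.expm1(-math.exp(z)))
        else:
            raise ValueError(f"unknown status: {status!r}")
    return total
```

maximizing this function over the coefficients and sigma (with any numerical optimizer) would give the parameter estimates; exact observations contribute the density, right-censored ones the survival function, and left-censored ones the cumulative distribution function.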
p<0.05 was considered statistically significant. among 7,042 cases, 3392 (48.17%) were local cases and 3304 (46.92%) were imported cases; the remaining 346 patients (less than 5%) could not confirm their travel history within 14 days before diagnosis. the temporal and spatial distribution of imported and local cases is shown in figure 2. panel a shows that the greater the intensity of migration with hubei, the more cases in the province. for provinces with migration intensity greater than 0.03%, the proportion of imported cases to total cases was about 50%. however, provinces with a very weak connection (<0.03%) with hubei, including tianjin, ningxia and hebei, had more local cases than imported cases. after the first intervention was adopted in hubei on january 23, the daily reported imported cases peaked on january 28, and the proportion of imported cases to total cases gradually decreased over time, reaching 50% on february 2 (figure 2b). 44.13%). for time interval, the frequency and best-fitting probability density function for toh and los are presented in figure 3. the left panel of table 3 shows the results of the first model for the factors influencing toh. a longer toh was observed in older cases and in cases from provincial capitals. the older the case, the longer the toh. as compared with the cases younger than 20, especially for cases older than furthermore, patients who lived in regions with lower migration intensity with hubei province had a shorter toh. in particular, compared with patients living in regions where the migration intensity was more than 0.15%, patients in regions with migration intensity (1) between 0.05% and 0.15% had 0.87 times the risk of a longer toh, (2) between 0.03% and 0.05% had 0.74 times the risk, and (3) less than 0.03% had 0.69 times the risk. 
in addition, there were no significant differences in toh between imported and local cases. the right panel of table 3 gives the hr estimates of factors associated with los. there were no significant differences in los among different gender or age groups. differences in los relative to city type and fever symptoms were also not statistically significant. however, patients with clear contact with a confirmed family case had a longer los (hr=1.05; 95% ci: 1.01,1.09) than patients without clear contact. moreover, we found that local patients had a shorter hospital stay than imported cases (hr=0.95; 95% ci: 0.91,0.99). furthermore, patients reported in the later periods of this epidemic had a shorter hospital stay than patients in the initial period (hr=0.66; 95% ci: 0.57,0.77). compared with patients whose toh was less than or equal to one day, the los of patients whose toh was more than 4 days was reduced by 0.05 percentage, and a similar result appeared in patients whose toh was 2-3 days (hr=0.94; 95% ci: 0.89,0.99). comprehensive epidemiological characteristics of covid-19 covering the entire period of the epidemic, and summaries of the experience from china, are useful for public health control. in this study, we described the epidemiological characteristics of imported and local cases, including temporal and spatial characteristics. indeed, regions with greater migration intensity with hubei had more imported cases. after the lockdown measures taken by cities in hubei from january 23 to interrupt sustained covid-19 transmission outside hubei province (nie et al., 2020), we found that the daily reported imported cases reached a peak on january 28 and gradually decreased thereafter. this suggests that traffic restrictions or lockdown in the epicenter can effectively reduce the export of cases (islam et al., 2020, zhang j. et al., 2020). 
moreover, outside of the epicenter, it is also clear that timely restriction and quarantine of suspected imported individuals with a travel history to the epicenter can effectively reduce local transmission by imported cases (kwok et al., 2020, lai c. k. c. et al., 2020). even in the provinces that were not in close contact with hubei, the surveillance of imported cases could not be overlooked. taking tianjin, ningxia and hebei province as examples, local cases were twice as numerous as imported cases, which was related to several local gathering events involving imported cases (dong et al., 2020, zhang s. x. et al., 2020). this study confirms previously described characteristics (liang et al., 2020, wu and mcgoogan, 2020), but also highlights the difference between imported and local cases. throughout this epidemic, imported patients were younger, had a higher proportion of males and included more provincial capital residents compared to local cases. this may match the situation that labor migrants in china are mainly young and middle-aged men. this result also suggests that older women living in non-provincial capital cities were at greater risk of exposure when the epidemic spread locally. a study on household transmission also found similar results (xu et al., 2020). moreover, the proportion of cases with a clearly confirmed contact history was higher in local cases than in imported cases. this may be due to the complicated epidemic chain in hubei province in the initial phase of the epidemic, which made it difficult to track the contact history of imported cases. nonetheless, approximately 40% of local cases may be attributed to household transmission. among the patients who were clearly exposed to confirmed cases, imported cases had more contacts with other confirmed cases than local cases on average, and contacts were mainly family members. 
although we are unable to determine the infectious relationship between them, this might partly explain why household transmission caused by imported cases was more prominent. this suggests that, after npis such as restricting population movement were taken, more effective interventions were still needed to control household transmission at the same time, especially the infection of elderly women in non-provincial capital cities by imported cases. indeed, the chinese government encouraged people to stay at home as much as possible (lai s. et al., 2020). however, cases that had migrated out of hubei before january 23 still posed a risk of household transmission locally. therefore, emergency measures were taken by local governments across china to strengthen the tracking and isolation of recent travelers from hubei (china national health commission of the people's republic of, 2020b, china the state council of the people's republic of, 2020), which reduced this risk to a certain extent. moreover, our study showed that the daily local cases reached a peak on the 14th day (february 6) after the lockdown, and then gradually declined. this also illustrates that the early response of the government is very important for containing the local spread of imported cases. our findings show that there was a lag of 2.67 days from symptom onset to hospital admission, and the average length of hospital stay was about 19 days, which were similar to previous studies conducted in china (khalili et al., 2020, liang et al., 2020, linton et al., 2020). surprisingly, we found that the older the patients were, the longer the hospitalization delay. considering that medical resources outside hubei province had not reached saturation, this might be related to the hospital admission pattern of viral respiratory diseases or the lack of recognition of the disease in elderly patients (petrilli et al., 2020). 
besides, the toh in the later phase of the epidemic showed a rebound trend. a slack attitude toward seeking medical care among cases reported in the later phase of the epidemic and a decline in control efforts are possible reasons. however, research in china (outside hubei province) from january 21 to february 17 demonstrated a shorter hospital admission delay from january 28 to february 17 (4.4 vs. 2.6 days) (zhang j. et al., 2020). before adjusting for other factors, our research also showed a slightly shorter hospital admission delay in the week after january 23. apart from the different study population and period, we consider that this result may be affected by confounding. our research included the later phase of the epidemic and adjusted for other demographic factors. this study also confirms that patients living in provincial capitals that were closely connected to the epicenter had a longer toh. this places new demands on epidemic prevention and control, that is, in provincial capital cities close to the epicenter, case tracking, surveillance and education on immediate admission/isolation should be emphasized. a mathematical model study showed that if the mean time from symptom onset to hospitalization can be halved by surveillance, then the probability that a case leads to transmission is very low (thompson, 2020). interestingly, we found associations of clear republic of, 2020b). in addition, our results also found that the average los of 19 days will not decrease with early admission. perhaps this is related to the characteristics of the viral infectious disease. by contrast, the decrease in los in the later phase of the epidemic may be due to the continuous improvement of medical technology for this disease. this study included a large number of cases over an entire epidemic and used a novel methodology. however, there are some limitations. first, as this is a retrospective study and the date of symptom onset is self-reported, there may be recall bias. 
second, although we made an effort to collect patient discharge information, we still could not obtain the discharge data of some patients. fortunately, nearly 90% of patients had been discharged from the hospital at the end-point of observation on march 2, which provides an opportunity for the statistical methodology using survival data with left censoring. third, given that the proportion of deaths in the study population was particularly small (less than 1%), the impact of death truncation was not considered when analyzing the length of hospitalization. finally, our study did not include the southeast provinces, but henan and zhejiang province were similar to those provinces in intensity of migration and scale of epidemic, and our results are also consistent with several studies conducted in shenzhen and hong kong on epidemiological characteristics during the same period (bi et al., 2020, lai c. k. c. et al., 2020). patients' education about early admission or isolation should still be given great importance in future prevention and control, especially for the elderly living in provincial capital cities that were more closely connected with the epicenter. feng zhou: data collection, data analysis, writing. chong you: data collection, writing. xiaoyu zhang: data analysis. kaihuan qian: data collection. yan hou: data collection. yanhui gao: data analysis. xiao-hua zhou: study design. not required. the study was anonymous, and individual information was collected from provincial or municipal health commissions, which are public data released to help control this epidemic. no potential conflict of interest exists in the submission of this manuscript, and the manuscript is approved by all authors for publication. panel a shows the frequency (blue histograms) and best-fitting probability density function (poisson, red curves) for time interval from symptom onset to hospitalization (≥0). 
panel b shows the frequency (blue histograms) and best-fitting probability density function (weibull, red curves) for length of hospital stay. baidu migration map epidemiology and transmission of covid-19 in 391 cases and 1286 of their close contacts in shenzhen, china: a retrospective cohort study clinical characteristics and treatment of critically ill patients with covid-19 in hebei update on pneumonia of new coronavirus infection as of 24:00 on national health commission of the people's republic of china.prevention and control plan for new coronavirus pneumonia the state council of the people's republic of china. the announcement on strengthening community prevention and control of pneumonia epidemic situation the analysis of count data: a gentle introduction to poisson regression and its alternatives dynamic variations of the covid-19 disease at different quarantine strategies in wuhan and mainland china rapid surveillance of covid-19 in the united states using a prospective space-time scan statistic: detecting and evaluating emerging clusters preparedness and vulnerability of african countries against importations of covid-19: a modelling study baseline characteristics and outcomes of 1591 patients infected with sars-cov-2 admitted to icus of the lombardy region, italy physical distancing interventions and incidence of coronavirus disease 2019: natural experiment in 149 countries the statistical analysis of failure time data epidemiological characteristics of covid-19: a systematic review and meta-analysis epidemiological characteristics of the first 53 laboratory-confirmed cases of covid-19 epidemic in hong kong epidemiological characteristics of the first 100 cases of coronavirus disease 2019 (covid-19) in hong kong special administrative region, china, a city with a stringent containment policy effect of non-pharmaceutical interventions to contain covid-19 in 
china early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia clinical characteristics and outcomes of hospitalised patients with covid-19 treated in hubei (epicentre) and outside hubei (non-epicentre): a nationwide analysis of china incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: a statistical analysis of publicly available case data space-time patterns, change, and propagation of covid-19 risk relative to the intervention scenarios in bangladesh epidemiological characteristics and incubation period of 7015 confirmed cases with coronavirus disease 2019 outside hubei province in china world health organization, who coronavirus disease (covid-19) dashboard. data last updated: 2020/9 world health organization, who director-general's opening remarks at the media briefing on covid-19 -11 survival analysis using sas: a practical guide, second edition factors associated with hospital admission and critical illness among 5279 people with coronavirus disease 2019 in new york city: prospective cohort study hospitalization and mortality among black patients and white patients with covid-19 presenting characteristics, comorbidities, and outcomes among 5700 patients hospitalized with covid-19 in the new york city area effect of delay in diagnosis on transmission of covid-19 sas/stat 14.2 user's guide: the glimmix procedure (chapter novel coronavirus outbreak in wuhan, china, 2020: intense surveillance is vital for preventing sustained transmission in new locations characteristics of covid-19 infection in beijing characteristics of and important lessons from the coronavirus disease 2019 (covid-19) outbreak in china: summary of a report of 72 314 cases from the chinese center for disease control and prevention household transmissions of sars-cov-2 in the time of unprecedented travel lockdown in china epidemiological and clinical characteristics of 333 confirmed cases with 
coronavirus disease evolving epidemiology of novel coronavirus diseases 2019 and possible interruption of local transmission outside hubei province in china: a descriptive and modeling study the analysis of clinical characteristics of 34 novel coronavirus pneumonia cases in ningxia hui autonomous region symptoms fever # gender male 2430 2.84(3.53) 2(0-4) reference 654 19 non-family-confirmed patient contact history # unclear 3962 2.87(3.55) 2(0-5) reference we thank xueqing liu, yuying li key: cord-345877-rhybnlw0 authors: pei, lijun title: prediction of numbers of the accumulative confirmed patients (nacp) and the plateau phase of 2019-ncov in china date: 2020-04-27 journal: cogn neurodyn doi: 10.1007/s11571-020-09588-4 sha: doc_id: 345877 cord_uid: rhybnlw0 in the present study, i propose a novel fitting method to describe the outbreak of 2019-ncov in china. the fitted data were selected carefully from the non-hubei part and hubei province of china respectively. for the non-hubei part, the period of data collection runs from the beginning of the policy of isolation to the present day. for hubei province, however, the data for wuhan city and hubei province were included from the time admissions to the huoshenshan hospital began to the present day, in order to ensure that all or the majority of the confirmed and suspected patients were collected for diagnosis and treatment. the basic functions employed for fitting are the hyperbolic tangent functions [formula: see text], since in these cases the 2019-ncov is just an epidemic. consequently, the 2019-ncov will initially expand rapidly and then tend to disappear. therefore, the numbers of the accumulative confirmed patients in different cities, provinces and geographical regions will initially increase rapidly and subsequently stabilize to a plateau phase. the selection of the basic functions for fitting is crucial. 
in the present study, i found that the hyperbolic tangent functions [formula: see text] could satisfy the aforementioned properties. by this novel method, i can obtain two significant results. they base on the conditions that the rigorous isolation policy is executed continually. initially, i can predict the numbers very accurately of the cumulative confirmed patients in different cities, provinces and parts in china, notably, in wuhan city with the smallest relative error estimated to [formula: see text] , in hubei province with the smallest relative error estimated to [formula: see text] and in the non-hubei part of china with the smallest relative error of [formula: see text] 0.195% in the short-term period of infection. in addition, perhaps i can predict the times when the plateau phases will occur respectively in different regions in the long-term period of infection. generally for the non-hubei part of china, the plateau phase of the outbreak of the 2019-ncov will be expected this march or at the end of this february. in the non-hubei region of china it is expected that the epidemic will cease on the 30th of march 2020 and following this date no new confirmed patient will be expected. the predictions of the time of inflection points and maximum nacp for some important regions may be also obtained. a specific plan for the prevention measures of the 2019-ncov outbreak must be implemented. this will involve the present returning to work and resuming production in china. based on the presented results, i suggest that the rigorous isolation policy by the government should be executed regularly during daily life and work duties. moreover, as many as possible the confirmed and suspected cases should be collected to diagnose or treat. it has been suggested that coronaviruses are threats to human life. this type of viruses, which was discovered and characterized in 1965, is broadly distributed in mammals and birds. 
in humans, the majority of the coronaviruses cause only mild respiratory infections, and a limited number, such as the "severe acute respiratory syndrome" (sars) in china (liu et al. 2004) and the "middle east respiratory syndrome" (mers) in saudi arabia and south korea, have caused more than 10,000 cumulative infected cases in the past 2 decades. although several coronaviruses have been identified and characterized, additional unknown coronaviruses that are potential threats remain to be discovered. in december 2019, pneumonia cases of unknown cause emerged in wuhan, the capital of hubei province and one of the largest cities in the central part of china. although the majority of the patients were cured, some cases led to respiratory failure and a few to death. this outbreak of pneumonia attracted significant attention around the world. the causative agent identified by the chinese authorities was designated 2019 novel coronavirus (2019-ncov) by the world health organization (who) on january 10 2020. on january 20 2020, the chinese government classified the novel 2019-ncov as a class a agent. a series of non-pharmaceutical interventions were implemented, namely, isolation of symptomatic persons, strict restriction of travel in hubei province and shutdown of public transport in various cities. although the number of the accumulative confirmed patients (nacp) in the non-hubei chinese regions has decreased continually for 7 days, the effectiveness and efficiency of these interventions are questionable. in addition, the time at which the viral infection reaches its plateau phase affects several factors, including financial costs and work abstention. therefore, it is crucial to contain the outbreak of the 2019-ncov. so far, there are nearly 20,000 confirmed cases in wuhan and more than 40,000 confirmed cases in china, while several exported cases have been confirmed in other countries including japan, south korea, singapore, usa, canada, germany, france, uk and spain. 
mathematical models were employed to investigate the viral outbreak and interesting results were obtained. the mathematical modeling of the 2019-ncov outbreak has been previously investigated (tang et al. 2020; rabajante 2020; imai et al. 2020; liu et al. 2020; fanelli and piazza 2020; peng et al. 2020; toda 2020; sameni 2020). the cause of the outbreak has been reported in previous studies (chu et al. 2020; khan et al. 2020a, b). its clinical characteristics and laboratory test results were also studied (qian et al. 2020; world health organization 2020). its treatment and prognosis were presented in two recent studies (chen and du 2020; chai et al. 2020). its containment strategy was discussed in (bittihn and golestanian 2020; hu et al. 2020). the prediction of the tendency of 2019-ncov, and notably of the nacp and of the plateau phase, is of great importance at present. in this study, these goals were achieved by fitting the data of the nacp in these regions. the 2019-ncov outbreak could not be modeled accurately due to the limited knowledge of its causes, transmission mechanisms, effect of control policies, treatment strategies and damages. the mechanism of 2019-ncov infection is still unclear and has been studied by several scientists. however, the data contain much information about the outbreak and can reveal many of its properties. therefore, these data were fitted in order to conduct the outbreak prediction. my novel fitting method has two significant features, as follows: • the first novel idea is the choice of the data of the nacp in different regions. for the regions of china outside hubei province, i.e., the non-hubei part, the medical conditions are sufficient and the isolation policy is well executed. all the confirmed patients can be collected and receive treatment, and the suspected cases can be collected for diagnosis and further treatment. therefore, the outbreak of the 2019-ncov is just a general epidemic. 
for hubei province, which includes wuhan city, several cabin hospitals and the leishenshan hospital were brought into use following the initial completion of the huoshenshan hospital. the majority of the confirmed patients can be collected for treatment and most suspected patients can also be collected for diagnosis and subsequent treatment. therefore, the outbreak of the 2019-ncov in wuhan city and hubei province is considered a general epidemic and can be treated as such. the data were collected from the non-hubei part of china from approximately january 20 2020, i.e., from the beginning of the policy of isolation to the present day. the data were collected from hubei province and from wuhan city from february 6 2020, i.e., from the date of the initial establishment of the huoshenshan hospital, to the present date. the data can be fitted to predict the nacp in the short term and to predict the onset of the plateau phases.
• the selection of the basic functions for the fitting model is crucial for the success of the prediction. since in both the above cases the 2019-ncov is just an epidemic, it will initially spread rapidly and subsequently tend to disappear. therefore, the numbers of the cumulative confirmed patients in different cities, provinces and geographical locations will initially increase rapidly and subsequently remain constant when reaching the plateau phase of the viral infection. the hyperbolic tangent functions $\tanh(\cdot)$ satisfy both conditions and were therefore set as the basic functions for fitting. this choice laid the foundation of the success of the fitting model and further enhanced the prediction success for the 2019-ncov infection. by this novel method, two significant results were obtained under the condition that the rigorous isolation policy is executed continually.
initially, the numbers of the accumulative confirmed patients in different cities, provinces and geographical locations in china were predicted very accurately in the short term period of infection. moreover, the times of the plateau phases were determined in different places in the long-term period of infection. generally, in the non-hubei china part, the nacp of 2019-ncov will tend to constant from approximately february 23 2020 and its maximum infectivity will be theoretically achieved by march 30 2020. following this date, no additional infected patient will be expected to be diagnosed. based on the present results, it is suggested that the rigorous isolation policy by the government should be executed continually. the remaining part of this article is organized as follows: in sect. 2, the novel fitting method of the outbreak of 2019-ncov in china is proposed, and the selection of the data and basic functions is presented. the validation of this novel method was achieved in the data derived from the sars infection in 2003 in china mainland and hongkong, which are presented in sect. 3. the results of the prediction of nacp and of the plateau phase, as well as of the ips of 2019-ncov in china are presented in sect. 4. finally, i present some concluding remarks in sect. 5. initially, i will present the novel fitting method for the prediction of nacp and the plateau phase of 2019-ncov in china. the success of the fitting or prediction depends on the selection of the data, the basic functions for this fitting and the fitting method. i will describe all three components in this section. the data of the nacp must correspond to the epidemic characteristics. they must fit into the epidemic pattern. all the confirmed patients must be collected for treatment and the suspected patients can be collected for diagnosis. the effective treatment of the infected patients and the efficient diagnosis of the suspected cases should be ensured. 
this is the basis of the principle to which the government adheres regarding as many as possible the confirmed and suspected should be collected to diagnose and treat. with regard to the chinese regions outside of hubei province, i.e., the non-hubei part of china, the medical conditions are sufficient to treat the infected cases and ultimately contain the spread of the virus. therefore the confirmed patients can be collected for the appropriate treatment and the suspected patients can be collected for diagnosis and further treatment, so that the outbreak of the 2019-ncov in this part will be considered as a general epidemic. therefore the data from the 21st of january 2020, which was the beginning of the strict isolation policy in the non-hubei region were used for fitting. the data of the nacp in cities and provinces with major viral outbreak are presented in tables 1 and 2. all data are collected from the official websites of the health commissions in these regions. the hubei province, which includes wuhan city, was thoroughly assessed. following the establishment of the huoshenshan hospital, the cases reported in several cabin hospitals and subsequently in the leishenshan hospital were examined. the majority of the confirmed patients were collected for successful treatment, and most of the suspected patients were collected for diagnosis and further treatment. therefore, the outbreak of the 2019-ncov in wuhan city and even hubei province is considered a general epidemic, suggesting that it should be treated as such. the data from wuhan city and hubei province were collected from the establishment of the huoshenshan hospital to the present date to ensure that all or the majority of the confirmed and suspected cases could be collected for diagnosis and treatment. the data of the nacp in wuhan and hubei province are presented in table 3 . all the data were collected from the official websites of the health commissions in hubei province and wuhan city. 
a notable change occurred on february 12 2020. the numbers of the cumulative clinically confirmed patients were added to those of the cumulative confirmed patients in hubei province and in wuhan city. more than 10,000 cases of this type were added into the data reported by february 12. therefore, the reported data switched from the restrictive confirmation standard to the relaxed confirmation standard. however, the data used here were collected based on the old restrictive standard. in future studies, the data should be fitted to the new relaxed confirmation standard of hubei province and wuhan city in order to predict the number of the accumulative confirmed patients and the plateau phase of the 2019-ncov infection. i will fit these data by the following novel basic functions. usually, the power functions $1, x, x^2, x^3, x^4, \ldots$ are used as the basic functions for fitting the data. however, the data of infectious diseases require a different set of basic functions. although power functions can fit the data, the fitted curve typically fluctuates, so a tendency such as the plateau phase cannot be predicted. therefore, the selection of the basic functions for this fitting is crucial. infectious diseases are characterized by the numbers of the cumulative confirmed patients. they initially increase rapidly and finally stabilize. at that stage no new confirmed patients appear. following isolation and treatment, the majority of the confirmed patients will recover and a limited number will not survive the infection. at last, the infectious disease is controlled and eradicated. it is well known that the hyperbolic tangent functions $\tanh(\cdot)$ exhibit two properties: an initial rapid increase and a final constant phase. based on these two properties, i take $1, \tanh(0.1x), \tanh(0.2x), \tanh(0.3x), \ldots$ or $1, \tanh(0.2x), \tanh(0.4x), \tanh(0.6x), \ldots$
as the basic functions for fitting the data of the nacp in these regions. the accuracy of the fitting by this novel method is excellent: not only can it predict the nacp in these regions on the next day with very small relative errors, but it can also plot the evolution curves of the nacp over the long run and may be used to estimate the day on which the plateau phase arrives. initially, the novel fitting method was used for estimation of the nacp. the number of infected cases per day can be predicted in some regions by the novel fitting method and the fitting function. a plot can be constructed and the days required to reach the plateau phase can be estimated. for example, in nanyang city in henan province, a seriously affected city close to wuhan, the restrictive isolation was executed from january 25 to february 7 (tables 1, 2). the data were fitted using the basic functions $1, \tanh(0.1x), \tanh(0.2x), \tanh(0.3x), \ldots$. the fitted function is

$f(x) = 6.4056 + 308.92\tanh(0.1x) - 261.789\tanh(0.2x) + 164.811\tanh(0.3x) - 65.3425\tanh(0.4x).$

the predicted number of cases on february 8 2020 was $f(16) = 129.653 \approx 130$. the actual number was 133 and the relative error was $-2.26\%$. the fitting results are shown in fig. 6a. the fit was very close. the prediction, described by the evolution curve, is displayed in fig. 6b. the curve becomes constant from february 22 2020. shanghai is a very important international city in china, and its nacp from january 20 to february 7 (tables 1, 2), following execution of the restrictive isolation, could be fitted with the same kind of basic functions. the number estimated for february 8 2020 was $f(20) = 290.174 \approx 290$. the actual number was 292 and the relative error was $-0.685\%$. the fitting result is shown in fig. 3a. the fit was very close. the prediction with the evolution curve is displayed in fig. 3b.
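the nanyang fit quoted above can be checked numerically. the following is a minimal python sketch that only evaluates the coefficients printed in the text; the function and variable names are illustrative assumptions, and the day index x = 16 for february 8 is taken from the text as given.

```python
import math

# coefficients of the nanyang fit as quoted in the text:
# f(x) = constant + sum of c * tanh(k * x) over the (k, c) pairs below
NANYANG_CONSTANT = 6.4056
NANYANG_TERMS = [(0.1, 308.92), (0.2, -261.789), (0.3, 164.811), (0.4, -65.3425)]


def fitted_nacp(x, constant=NANYANG_CONSTANT, terms=NANYANG_TERMS):
    """evaluate the fitted cumulative case count at day index x."""
    return constant + sum(c * math.tanh(k * x) for k, c in terms)


def relative_error(predicted, actual):
    """signed relative error of a prediction, as a fraction."""
    return (predicted - actual) / actual


prediction = fitted_nacp(16)            # day index the text assigns to february 8
rounded = round(prediction)             # the text rounds before comparing
error_pct = 100 * relative_error(rounded, 133)

# every tanh term tends to 1 for large x, so the plateau is the coefficient sum
plateau = NANYANG_CONSTANT + sum(c for _, c in NANYANG_TERMS)

print(f"f(16) = {prediction:.3f}, rounded {rounded}, error {error_pct:.2f}%")
print(f"plateau level = {plateau:.3f}")
```

because each tanh term saturates at 1, the long-run level of the curve is simply the sum of all coefficients, which is why this basis yields a plateau by construction.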
it is expected that it will tend to be constant from approximately february 20 2020. a further example is the non-hubei part of china. the nacp in that region was reported from january 25 to february 7 (tables 1, 2), when the strict isolation was executed. the data could be fitted with the basic functions $1, \tanh(0.1x), \tanh(0.2x), \tanh(0.3x), \ldots$. the following formula was obtained:

$f(x) = 959.537 + 35049.5\tanh(0.1x) - 77153.2\tanh(0.2x) + 169568\tanh(0.3x) - 243464\tanh(0.4x) + 187331\tanh(0.5x) - 59145.8\tanh(0.6x).$

the number estimated for february 8 2020 was $f(15) = 10162.9 \approx 10{,}163$. the actual number was 10,098, and the relative error was $0.64\%$. the fitting result is shown in fig. 8a. the fitting was excellent. the prediction, i.e., the evolution curve, is displayed in fig. 8b. this curve is expected to stabilize from february 23 2020. all fittings and the evolution curves for the different regions are presented in figs. 2, 3, 4, 5, 6, 7, 8, 9 and 10. the prediction for the next day, i.e., february 9 2020, proceeds in the same way. the actual number on february 8 2020 was appended to the old data, and the extended data were fitted again with the same method and the above basic functions. this gives a new fitting function, from which the number of infected cases on february 9 2020 is obtained, together with updated fitting results and evolution curves. the predictions of nacp on consecutive days are presented in tables 4 and 5. initially, this novel fitting method was applied to the fitting and prediction of the sars data in mainland china and hong kong in 2003 to assess the effectiveness of the method. the results are presented in fig. 1. the fitting was excellent and could predict the evolution of sars in mainland china and in hong kong in 2003 qualitatively and quantitatively.
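the fitting step described above amounts to a linear least-squares problem, because the fitted function is linear in its coefficients once the tanh scales are fixed. the sketch below assumes ordinary least squares via numpy (the text does not state which solver was used) and uses synthetic stand-in data generated from the nanyang formula rather than the real case counts of tables 1 and 2; all function names are hypothetical.

```python
import numpy as np


def tanh_design_matrix(x, n_terms, step=0.1):
    """columns 1, tanh(step*x), tanh(2*step*x), ..., tanh(n_terms*step*x)."""
    x = np.asarray(x, dtype=float)
    columns = [np.ones_like(x)]
    columns += [np.tanh(step * j * x) for j in range(1, n_terms + 1)]
    return np.column_stack(columns)


def fit_tanh_basis(x, y, n_terms, step=0.1):
    """least-squares coefficients for the tanh basis (assumed solver)."""
    design = tanh_design_matrix(x, n_terms, step)
    coeffs, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return coeffs


def predict(coeffs, x, step=0.1):
    """evaluate the fitted function at the day indices x."""
    return tanh_design_matrix(x, len(coeffs) - 1, step) @ coeffs


# synthetic stand-in for observed counts: values of the quoted nanyang formula
true_coeffs = np.array([6.4056, 308.92, -261.789, 164.811, -65.3425])
days = np.arange(1, 16)
counts = predict(true_coeffs, days)

coeffs = fit_tanh_basis(days, counts, n_terms=4)
next_day = float(predict(coeffs, [16])[0])   # one-day-ahead prediction
plateau = float(coeffs.sum())                # limit of the fitted curve

print(f"next-day prediction: {next_day:.3f}, plateau: {plateau:.3f}")
```

for the rolling next-day forecast, one would append the newly reported count to `counts`, extend `days` by one, and call `fit_tanh_basis` again. the design matrix is fairly ill-conditioned because the tanh columns are similar, so with noisy real data the individual coefficients can be large and of alternating sign, as in the formulas quoted in the text.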
the data of sars in china mainland were obtained from the official website of who (world health organization) from april 21 2003 to july 11 2003 (https://www.who.int/csr/sars/country/en/). the data from the hongkong were obtained from the website from the march 17 2003 to the 1st of july 2003 (https://www.who.int/csr/sars/country/en/). the data implied that the novel fitting method is excellent for sars in 2003 and perhaps valid and effective for the 2019-ncov. in the next section, this fitting method was employed to the prediction of nacp and the plateau phase of 2019-ncov in china. in this subsection, the prediction of nacp will be presented in the continuous days at different regions in tables 4 and 5 . all fittings and the evolution curves for different regions are represented in figs. 2, 3, 4, 5, 6, 7, 8, 9 and 10. in this subsection, the inflection points (ip) in different chinese regions are presented. ip stands for the maximum of nacp that was achieved. after this point, the infectious disease will be controlled and the majority of the confirmed patients will recover, whereas a low number of cases will succumb to the disease. the prediction of this point is of great importance to the inhibition of the 2019-ncov outbreak. in the present study, the prediction of ips in the 2019-ncov outbreak in different regions of china is presented (figs. 11, 12, 13 and tables 6, 7, 8, 9) . these regions are usually important cities or the main affected regions around hubei province. it is of great importance to investigate the date of the 2019-ncov termination. it is very helpful for the prevention of 2019-ncov outbreak to construct a plan for their daily life, returning to work and resuming production by the government. it can be deduced that the cities in the mainland china outside of hubei province should be able to control the outbreak of 2019-ncov before march 30 2020 and after this date no confirmed patient should be reported. 
the nearby regions, especially the severe outbreak regions such as henan, hunan, jiangxi and anhui provinces, the important industrial and financial regions such as guangdong, zhejiang and jiangsu provinces, and the important international cities are covered in figs. 11, 12, 13 and tables 6, 7, 8, 9. since the medical conditions and curative efficiency are still being improved, the nacp of hubei province and wuhan city cannot represent the real number of infected subjects. thus, the nacp or ip there cannot be predicted as reliably. nevertheless, in this paper i have tried to predict the nacps and ips for wuhan city and hubei province. the results for the nacps are also excellent and are presented in tables 10, 11 and 12. the results for the ips in these two regions are presented in fig. 13 (y-z) and table 6. provided that the rigorous isolation policy is executed continually, the results of the present study are significant. firstly, the cumulative numbers of confirmed patients in different regions of china can be predicted very accurately. notably, hubei province exhibited the smallest relative error (0.012%), followed by wuhan city (0.021%). the non-hubei chinese region exhibited a larger relative error (-0.195%) in the short-term period of infection. moreover, it is possible to predict the time points at which the plateau phases develop in different regions in the long run. for the non-hubei chinese region, the nacp of the 2019-ncov infection was generally shown to tend to a constant from february 23 2020. this evidence is very important for the fight against the 2019-ncov outbreak in china. recently, the number of imported confirmed patients has been increasing in international cities such as beijing, shanghai and guangzhou, since more and more chinese people have returned from other countries suffering from worse outbreaks of 2019-ncov.
since the daily numbers of people returning from these countries and the rates of infection in these countries differ, the number of imported confirmed patients is stochastic. this situation differs from that of mainland china and is unrelated to the outbreak of 2019-ncov in mainland china. it cannot be predicted accurately by the method of this paper, and i will study it in a separate paper in the future. it is a challenge for the fight against the 2019-ncov in china. based on these results, it is suggested not only that the rigorous isolation policy by the government should be executed continually, but also that the concomitant diagnosis and treatment of as many confirmed and suspected cases as possible should be facilitated. this will speed up the containment of the viral outbreak. in the present study, the novel fitting method was employed to predict the nacp and the plateau phase of the 2019-ncov infection in different regions of china. the data were collected over different time periods in different regions of china to ensure that as many confirmed and suspected cases as possible could be collected for diagnosis or treatment. the hyperbolic tangent functions were used as the basic functions for the fitting method. two significant results were obtained. firstly, the nacp could be predicted very accurately in different regions of china, notably in wuhan city and hubei province, with very small relative errors. in the non-hubei region of china, larger relative errors were noted in the short-term period of infection. secondly, the time point at which the plateau phase occurs can be predicted in different regions in the long run. generally, for the non-hubei chinese regions, the plateau phase of 2019-ncov is predicted at approximately march 30 2020, and after this date no new confirmed patients are expected.
the predictions of the times of the inflection points (ips) and of the maximum nacp for certain important regions were also presented. these measures are very important for the prevention of the 2019-ncov outbreak and for planning the return to work, the resumption of production and daily life in china. based on these results, it is suggested that the rigorous isolation policy should be executed by the
containment strategy for an epidemic based on fluctuations in the sir model
specific ace2 expression in cholangiocytes may cause liver damage after 2019-ncov infection
a time delay dynamical model for outbreak of 2019-ncov and the parameter identification
potential natural compounds for preventing 2019-ncov infection
molecular diagnosis of a novel coronavirus (2019-ncov) causing an outbreak of pneumonia
analysis and forecast of covid-19 spreading in china
forecasting and evaluating intervention of covid-19 in the world
estimating the potential total number of novel coronavirus cases in wuhan city
novel coronavirus is putting the whole world on alert
studies on mathematical models for sars outbreak prediction and warning
predicting the cumulative number of cases for the covid-19 epidemic in china from early data
epidemic analysis of covid-19 in china by dynamical modeling
clinical characteristics of 2019 novel infected coronavirus pneumonia: a systemic review and meta-analysis
rabajante fj (2020) insights from early mathematical models of 2019-ncov acute respiratory disease (covid-19) dynamics
mathematical modeling of epidemic diseases
estimation of the transmission risk of the 2019-ncov and its implication for public health interventions
susceptible-infected-recovered (sir) dynamics of covid-19 and economic impact
world health organization (2020) laboratory testing of 2019 novel coronavirus (2019-ncov) in suspected human cases: interim guidance
acknowledgements the author would like to acknowledge the financial support for this research via the nnsf of china under the grant nos. 11972327, 11372282 and 10702065. the author also thanks prof. rubin wang, editor-in-chief of cognitive neurodynamics, for his helpful discussions, valuable suggestions and kind help. all best wishes for all citizens in hubei province, esp. wuhan city, and the doctors, nurses, scientists and plas there from all over china helping them.
key: cord-327096-m87tapjp authors: peng, liangrong; yang, wuyue; zhang, dongyan; zhuge, changjing; hong, liu title: epidemic analysis of covid-19 in china by dynamical modeling date: 2020-02-18 journal: nan doi: 10.1101/2020.02.16.20023465 sha: doc_id: 327096 cord_uid: m87tapjp the outbreak of the novel coronavirus (2019-ncov) epidemic has attracted worldwide attention. herein, we propose a mathematical model to analyze this epidemic, based on a dynamic mechanism that incorporates the intrinsic impact of hidden latent and infectious cases on the entire process of transmission. this model is validated by data correlation analysis, prediction of the recent public data, and backtracking, as well as sensitivity analysis. the dynamical model reveals the impact of various measures on the key parameters of the epidemic. according to the public data of nhcs from 01/20 to 02/09, we predict the epidemic peak and possible end time for 5 different regions. the epidemics in beijing and shanghai are expected to end before the end of february, and those in mainland/hubei and hubei/wuhan before mid-march. the model predicts that the outbreak in wuhan will end in early april. as a result, more effective policies and more efforts on clinical research are demanded.
moreover, through the backtracking simulation, we infer that the outbreak of the epidemic in mainland/hubei, hubei/wuhan, and wuhan can be dated back to the end of december 2019 or the beginning of january 2020. a novel coronavirus, formerly called 2019-ncov, or sars-cov-2 by ictv (severe acute respiratory syndrome coronavirus 2, by the international committee on taxonomy of viruses) caused an outbreak of atypical pneumonia, now officially called covid-19 by who (coronavirus disease 2019, by world health organization) first in wuhan, hubei province in dec., 2019 and then rapidly spread out in the whole china 1 . as of 24:00 feb. 13th, 2020 (beijing time), there are over 60, 000 reported cases (including more than 1, 000 death report) in china, among which, over 80% are from hubei province and over 50% from wuhan city, the capital of hubei province 2,3 . the central government of china as well as all local governments, including hubei, has tightened preventive measures to curb the spreading of covid-19 since jan. 2020. many cities in hubei province have been locked down and many measures, such as tracing close contacts, quarantining infected cases, promoting social consensus on self-protection like wearing face mask in public area, etc. however, until the finishing of this manuscript, the epidemic is still ongoing and the daily confirmed cases maintain at a high level. during this anti-epidemic battle, besides medical and biological research, theoretical studies based on either statistics or mathematical modeling may also play a non-negligible role in understanding the epidemic characteristics of the outbreak, in forecasting the inflection point and ending time, and in deciding the measures to curb the spreading. for this purpose, in the early stage many efforts have been devoted to estimate key epidemic parameters, such as the basic reproduction number, doubling time and serial interval, in which the statistics models are mainly used [4] [5] [6] [7] [8] [9] . 
due to the limitation of detection methods and restricted diagnostic criteria, asymptomatic or mild patients are possibly excluded from the confirmed cases. to this end, some methods have been proposed to estimate untraced contacts 10 , undetected international cases 11 , or the actual infected cases in wuhan and hubei province based on statistical models 12 , or the epidemic outside hubei province and overseas 6, [13] [14] [15] . with the improvement of clinical treatment of patients as well as stricter measures stepped up for containing the spread, many researchers investigate the effect of such changes by statistical reasoning 16, 17 and stochastic simulation 18, 19 . compared with statistical methods 20,21 , mathematical modeling based on dynamical equations 15,22-24 receives relatively less attention, though it can provide a more detailed mechanism for the epidemic dynamics. among such models, the classical susceptible exposed infectious recovered model (seir) is the most widely adopted one for characterizing the covid-19 outbreak in both china and other countries 25 . based on the seir model, one can also assess the effectiveness of various measures since the outbreak 23, 24, [26] [27] [28] , which seems to be a difficult task for general statistical methods. the seir model was also utilized to compare the effects of the lock-down of hubei province on the transmission dynamics in wuhan and beijing 29 . as the dynamical model can reach interpretable conclusions on the outbreak, a cascade of seir models has been developed to simulate the processes of transmission from infection source, hosts and reservoir to human 30 .
the copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. all rights reserved. no reuse allowed without permission. https://doi.org/10.1101/2020.02.16.20023465 doi: medrxiv preprint
there are also notable generalizations of the seir model for evaluating the transmission risk and predicting patient numbers, in which each group is divided into two subpopulations, the quarantined and the unquarantined 23, 24 . the extension of the classical seir model with delays 31,32 is another route to simulate the incubation period and the period before recovery. however, due to the lack of official data and the change of diagnostic criteria in the early stage of the outbreak, most early published models were either too complicated to avoid the overfitting problem, or the parameters were estimated based on limited and less accurate data, resulting in questionable predictions. in this work, we carefully collect the epidemic data from the authoritative sources: the such a design aims to minimize the influence of hubei province and wuhan city on the data set due to their extremely large infected populations compared to other regions. without further specific mention, these conventions will be adopted throughout the whole paper.
a. generalized seir model
{s(t), p(t), e(t), i(t), q(t), r(t), d(t)} denote at time t the respective numbers of the susceptible cases, insusceptible cases, exposed cases (infected but not yet infectious, in a latent period), infectious cases (with infectious capacity and not yet quarantined), quarantined cases (confirmed and infected), recovered cases and closed cases (or deaths). the addition of a new quarantined state is driven by data; together with the recovered state, it replaces the original r state in the classical seir model. their relations are given in fig.
1 and characterized by a group of ordinary differential equations (or difference equations if we consider discrete time, see si). the constant n = s + p + e + i + q + r + d is the total population in a certain region. the coefficients {α, β, γ^{-1}, δ^{-1}, λ(t), κ(t)} represent the protection rate, infection rate, average latent time, average quarantine time, cure rate, and mortality rate, respectively. in particular, to take the improvement of public health into account, such as promoting the wearing of face masks, more effective contact tracing and stricter lock-down of communities, we assume that the susceptible population is steadily decreasing and thus introduce a positive protection rate α into the model. in this case, the basic reproduction it is noted that here we assume the cure rate λ and the mortality rate κ are both time dependent. as confirmed in fig. 2a-d, the cure rate λ(t) gradually increases with time, while the mortality rate κ(t) quickly decreases to less than 1% and stabilizes after jan. 30th. this phenomenon is likely caused by the assistance of other emergency medical teams, the application of new drugs, etc. furthermore, the average contact number of an infectious person is calculated in fig. 2e-f and could provide some clue on the infection rate. it is clearly seen that the average contact number is basically stable over time, but shows a remarkable difference among various regions, which could be attributed to different quarantine policies and their implementation inside and outside hubei (or wuhan), since a less severely affected region is more likely to trace the close contacts of a confirmed case.
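the model equations themselves appear only in fig. 1 and the si, which are not part of this text. as a reading aid, the following python sketch integrates one plausible set of equations consistent with the compartments and coefficients named above: s loses members by infection at rate β s i / n and by protection at rate α, e becomes i at rate γ, i is quarantined at rate δ, and q leaves to r and d at rates λ(t) and κ(t). this flow structure, the forward-euler scheme and every numerical default are assumptions for illustration, not the authors' fitted values.

```python
def simulate_generalized_seir(n_days, population, e0, i0, q0=0.0, r0=0.0, d0=0.0,
                              alpha=0.1, beta=1.0, gamma=0.5, delta=0.15,
                              cure_rate=lambda t: 0.02,
                              death_rate=lambda t: 0.005,
                              steps_per_day=10):
    """forward-euler integration of an assumed s-p-e-i-q-r-d flow structure.

    gamma and delta are the reciprocals of the latent and quarantine times.
    returns one tuple (s, p, e, i, q, r, d) per day, starting with day 0.
    """
    dt = 1.0 / steps_per_day
    s = population - e0 - i0 - q0 - r0 - d0
    p, e, i, q, r, d = 0.0, e0, i0, q0, r0, d0
    history = [(s, p, e, i, q, r, d)]
    for step in range(n_days * steps_per_day):
        t = step * dt
        # compute all flows from the current state before updating anything
        protected = alpha * s
        infected = beta * s * i / population
        onset = gamma * e
        quarantined = delta * i
        cured = cure_rate(t) * q
        died = death_rate(t) * q
        s += dt * (-infected - protected)
        p += dt * protected
        e += dt * (infected - onset)
        i += dt * (onset - quarantined)
        q += dt * (quarantined - cured - died)
        r += dt * cured
        d += dt * died
        if (step + 1) % steps_per_day == 0:
            history.append((s, p, e, i, q, r, d))
    return history


# illustrative run with made-up initial values
population = 1.0e7
history = simulate_generalized_seir(30, population, e0=300.0, i0=300.0)
final_q = history[-1][4]
print(f"quarantined cases after 30 days: {final_q:.0f}")
```

every flow leaves one compartment and enters another, so the total s + p + e + i + q + r + d stays equal to n at each step, matching the constant-population statement in the text.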
a similar regional difference is observed for the severe condition rate too. in fig. 2g-h, hubei and wuhan overall show a much higher severe condition rate than shanghai. although it is generally expected that patients need a period of time to become infectious, to be quarantined, or to recover from illness, we do not find strong evidence for the necessity of including time delay (see si for more details). as a result, the time-delayed equations are not considered in the current work for simplicity. according to the daily official reports of nhc of china, the cumulative numbers of quarantined cases, recovered cases and closed cases are publicly available. however, since the latter two are directly related to the first one through the time dependent recovery rate and mortality rate, the number of quarantined cases q(t) plays a key role in our modeling. a similar argument applies to the number of insusceptible cases too. furthermore, as the accurate numbers of exposed cases and infectious cases are very hard to determine, they will be treated as hidden variables during the study. leaving aside the time dependent parameters λ(t) and κ(t), there are four unknown coefficients {α, β, γ^{-1}, δ^{-1}} and two initial conditions {e_0, i_0} for the hidden variables (other initial conditions are known from the data) that have to be extracted from the time series data {q(t)}. such an optimization problem could be solved automatically by using the simulated annealing algorithm (see si for details). a major difficulty is how to overcome the overfitting problem. to this end, we firstly prefix the latent time γ^{-1}, which is generally estimated within several days 5, 33, 34 .
and then for each fixed γ^{-1}, we explore its influence on other parameters (β = 1 nearly unchanged), initial values, as well as the population dynamics of quarantined cases and infected cases during best fitting. from fig. 3a-b, to produce the same outcome, the protection rate α and the reciprocal of the quarantine time δ^{-1} are both decreasing with the latent time γ^{-1}, which is consistent with the fact that longer latent time requires longer quarantine time. meanwhile, the initial values of exposed cases and infectious cases are increasing with the latent time. since e_0 and i_0 include asymptomatic patients, they both should be larger than the number of quarantined cases. furthermore, as the time period between the starting date of our simulation (jan. 20th) and the initial outbreak of covid-19 (generally believed to be earlier than jan. 1st) is much longer than the latent time (3-6 days), e_0 and i_0 have to be close to each other, so that only their sum e_0 + i_0 matters during the fitting. an additional important finding is that in all cases β is always very close to 1, which agrees with the observation that covid-19 has an extremely strong infectious ability. nearly every unprotected person will be infected after a direct contact with covid-19 patients 5,33,34 . in summary, we conclude that once the latent time γ^{-1} is fixed, the fitting accuracy on the time series data {q(t)} basically depends on the values of α, δ^{-1} and e_0 + i_0. based on a reasonable estimation of the total number of infected cases (see fig. 3c-d), the latent time is finally determined as 2 days.
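the fitting procedure described above (fix the latent time, then search over α, the quarantine rate and e_0 + i_0 so that the simulated q(t) matches the data) can be illustrated with a small hand-written simulated annealing loop. the authors' actual algorithm and settings are in their si and are not reproduced here; the reduced model, the cooling schedule, the parameter bounds and the synthetic "observed" series below are all assumptions made for this sketch.

```python
import math
import random


def quarantined_series(alpha, delta, total_hidden, n_days, population=1.0e7,
                       beta=1.0, gamma=0.5, removal=0.03, steps_per_day=10):
    """euler integration of a reduced s-e-i-q chain; returns q at each day end."""
    dt = 1.0 / steps_per_day
    e = i = total_hidden / 2.0      # e0 and i0 taken equal: only their sum is fitted
    s, q, series = population - e - i, 0.0, []
    for step in range(n_days * steps_per_day):
        to_e, to_i, to_q = beta * s * i / population, gamma * e, delta * i
        s += dt * (-to_e - alpha * s)
        e += dt * (to_e - to_i)
        i += dt * (to_i - to_q)
        q += dt * (to_q - removal * q)
        if (step + 1) % steps_per_day == 0:
            series.append(q)
    return series


def loss(params, observed):
    """sum of squared differences between simulated and observed q(t)."""
    alpha, delta, hidden = params
    model = quarantined_series(alpha, delta, hidden, len(observed))
    return sum((m - o) ** 2 for m, o in zip(model, observed))


def anneal(observed, start, bounds, n_iter=2000, t0=1.0, seed=0):
    """generic simulated annealing with a linear cooling schedule."""
    rng = random.Random(seed)
    current, current_loss = list(start), loss(start, observed)
    best, best_loss = list(current), current_loss
    for k in range(n_iter):
        temperature = t0 * (1.0 - k / n_iter) + 1e-9
        candidate = [min(hi, max(lo, v + rng.gauss(0.0, 0.05 * (hi - lo))))
                     for v, (lo, hi) in zip(current, bounds)]
        cand_loss = loss(candidate, observed)
        # relative change keeps the acceptance test independent of data scale
        change = (cand_loss - current_loss) / (current_loss + 1e-12)
        if change <= 0 or rng.random() < math.exp(-change / temperature):
            current, current_loss = candidate, cand_loss
            if current_loss < best_loss:
                best, best_loss = list(current), current_loss
    return best, best_loss


# synthetic "observations" generated from known parameters (not real data)
observed = quarantined_series(0.08, 0.2, 400.0, 20)
start = [0.2, 0.1, 100.0]
bounds = [(0.0, 0.5), (0.01, 1.0), (10.0, 2000.0)]
start_loss = loss(start, observed)
best, best_loss = anneal(observed, start, bounds)
print(f"loss reduced from {start_loss:.3g} to {best_loss:.3g}; best = {best}")
```

fixing γ and fitting only three quantities mirrors the text's way of limiting overfitting; with real data one would repeat the search for several fixed latent times and compare the implied totals of infected cases.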
in order to further evaluate the influence of the other fitting parameters on the long-term forecast, we perform a sensitivity analysis on the data of wuhan (results for other regions are similar and not shown) by systematically varying the values of the unknown coefficients [35, 36]. as shown in fig. 3e-f, the predicted total infected cases at the end of the epidemic, as well as the inflection point, at which the basic reproduction number falls below 1 [6], both show a positive correlation with the infection rate β and the quarantine time δ^{-1} and a negative correlation with the protection rate α. these facts agree with common sense and highlight the necessity of self-protection (increase α and decrease β), timely disinfection (increase α and decrease β), early quarantine (decrease δ^{-1}), etc. an exception is found for the initial total infected cases. although a larger value of e_0 + i_0 could substantially increase the final total infected cases, it has no impact on the inflection point, which can be seen from the formula for the basic reproduction number. we apply the generalized seir model described above to interpret the public data on the cumulative numbers of quarantined cases, recovered cases and closed cases from jan. 20th to feb. 9th, which have been published daily by nhc of china since jan. 20th. our preliminary study includes five different regions, i.e. the mainland*, hubei*, wuhan, beijing and shanghai. through extensive simulations, the optimal values for the unknown model parameters and initial conditions, which best explain the observed cumulative numbers of quarantined cases, recovered cases and closed cases (see fig. 4), are determined and summarized in table 1.
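the dependence of the inflection point on α, β and δ^{-1}, and its independence of e_0 + i_0, can be illustrated with a short sketch. the formula used below, r(t) = β δ^{-1} (1 - α)^t, is an assumption made for illustration, since the formula for the basic reproduction number is not reproduced in this section; the parameter values are hypothetical.

```python
def reproduction_number(beta, quarantine_time, alpha, day):
    """Assumed form R(T) = beta * delta^{-1} * (1 - alpha)^T."""
    return beta * quarantine_time * (1.0 - alpha) ** day

def inflection_day(beta, quarantine_time, alpha, max_days=1000):
    """First day on which the assumed reproduction number drops below 1."""
    for day in range(max_days + 1):
        if reproduction_number(beta, quarantine_time, alpha, day) < 1.0:
            return day
    return None  # not reached within max_days

# Hypothetical values: beta = 1, quarantine time of 7 days, protection rate 0.1.
base_day = inflection_day(1.0, 7.0, 0.1)
more_protection = inflection_day(1.0, 7.0, 0.2)    # larger alpha
slower_quarantine = inflection_day(1.0, 10.0, 0.1)  # larger delta^{-1}
```

under this assumed form the initial values e_0 and i_0 do not appear at all, which matches the statement that they change the final epidemic size but not the inflection point.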
several remarkable facts can be immediately learnt from table 1. firstly, the protection rate of wuhan is significantly lower than that of other regions, showing that many infected cases may not yet have been well quarantined by feb. 9th (the smaller α for wuhan does not necessarily mean that people in wuhan pay less attention to self-protection, but is more likely due to the higher mixing ratio of susceptible cases with infectious cases). similarly, although the average protection rate for hubei* is higher than that of wuhan, it is still significantly lower than that of other regions. secondly, the quarantine times for beijing and shanghai are the shortest, while that for mainland* is in between. again, the quarantine times for wuhan and hubei* are the longest. finally, the estimated numbers of total infected cases on jan. 20th in the five regions are all significantly larger than one, suggesting that covid-19 had already spread nationwide by that moment. we will come back to this point in the next part. the initial values for exposed cases and infectious cases separately. the time-dependent cure rate λ(t) and mortality rate κ(t) can be read out from fig. 2 and are given in si. most importantly, with the model and parameters in hand, we can carry out simulations for a longer time and forecast the potential tendency of the covid-19 epidemic. in fig. 4 and fig. 5a-b, the predicted cumulative number of quarantined cases and the current number of exposed cases plus infectious cases are plotted for the next 30 days as well as for a shorter period of the next 13 days. official data published by nhc of china from feb. 10th to 15th are marked as red spots and taken as a direct validation.
overall, except for wuhan, the validation data agree well with our forecast and all fall into the 95% confidence interval (shaded area). we are delighted to see that most of them are lower than our predictions, showing that the nationwide anti-epidemic measures in china have come into play. for wuhan city (and also hubei province), due to the inclusion of suspected cases with clinical diagnosis into confirmed cases (12364 cases for wuhan and 968 cases for hubei* on feb. 12th) announced by nhc of china since feb. 12th during the preparation of our manuscript, there is a sudden jump in the quarantined cases. although it to some extent offsets our original overestimates, it also reveals the current severe situation in wuhan city, which requires much closer attention in the future. towards the epidemic of covid-19, our basic predictions are summarized as follows: 1. based on optimistic estimation, the epidemic of covid-19 in beijing and shanghai would end within two weeks (since feb. 15th). for most parts of the mainland, success against the epidemic will come no later than the middle of march. the situation in wuhan is still very severe, at least based on public data until feb. 15th. we expect it to end at the beginning of april. are not included into parameter estimation). by coincidence, on the same day, we witnessed a sudden jump in the number of confirmed cases due to relaxed diagnostic criteria, meaning that more suspected cases will receive better medical care and have much lower chances of spreading the virus. besides, the wuhan local government announced the completion of a community survey of all confirmed cases, suspected cases and close contacts in the whole city.
besides the forecast, the early trajectory of the covid-19 outbreak is also critical for our understanding of the epidemic as well as for future prevention. to this end, by adopting the shooting method, we carry out inverse inference to explore the early epidemic dynamics of covid-19 since its onset in mainland*, hubei*, and wuhan (beijing and shanghai are not considered because their numbers of infected cases on jan. 20th are too small). with respect to the parameters and initial conditions listed in table 1, we make the astonishing finding that, for all three cases, the outbreaks of covid-19 point to 20-25 days before jan. 20th (the starting date for public data and our modeling). it means that the epidemic of covid-19 in these regions started no later than jan. 1st (see fig. 5d), in agreement with reports by li et al. [5, 33, 34]. in this stage (from jan. 1st to jan. 20th), the number of total infected cases follows a nice exponential curve with a doubling time of around 2 days. this in some way explains why statistical studies with either exponential functions or logistic models could work very well on the limited early data points. furthermore, we notice that the number of infected cases based on inverse inference is much larger than the reported confirmed cases in wuhan city before jan. 20th. in this study, we propose a generalized seir model to analyze the epidemic of covid-19, which was first reported in wuhan last december and then quickly spread nationwide in china. our model properly incorporates the intrinsic impact of hidden exposed and infectious cases on the entire course of the epidemic, which is difficult for traditional statistical analysis.
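the link between a doubling time of about 2 days and an onset 20-25 days before jan. 20th can be checked with simple arithmetic. the sketch below does not implement the shooting method used in the paper; it is a plain exponential extrapolation, with a log-linear least-squares fit for the doubling time, and the case counts are synthetic and hypothetical.

```python
import math

def doubling_time(days, cases):
    """Estimate the doubling time from a least-squares fit of log(cases) vs. day."""
    logs = [math.log(c) for c in cases]
    n = len(days)
    mean_t = sum(days) / n
    mean_y = sum(logs) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logs))
             / sum((t - mean_t) ** 2 for t in days))
    return math.log(2) / slope  # growth rate r gives T_d = ln 2 / r

def days_since_single_case(cases_now, t_double):
    """Days needed to grow from one case to `cases_now` at a fixed doubling time."""
    return t_double * math.log2(cases_now)

# Synthetic series that doubles every 2 days (not real data).
sample_days = list(range(20))
sample_cases = [10 * 2 ** (t / 2) for t in sample_days]
estimated_td = doubling_time(sample_days, sample_cases)

# A hypothetical count of 4096 infected cases on the reference date.
onset_offset = days_since_single_case(4096, 2.0)
```

with a 2-day doubling time, an onset 20 to 25 days earlier corresponds to roughly 2^10 to 2^12.5 infected cases on the reference date, i.e. on the order of one thousand to several thousand cases.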
a new quarantined state, together with the recovery state, replaces the original r state in the classical seir model and correctly accounts for the daily reported confirmed infected cases and recovered cases. based on a detailed analysis of the public data of nhc of china from jan. 20th to feb. 9th, we estimate several key parameters for covid-19, like the latent time, the quarantine time and the basic reproduction number, in a relatively reliable way, and predict the inflection point, possible ending time and final total infected cases for hubei, wuhan, beijing, shanghai, etc. overall, the epidemic situations for beijing and shanghai are optimistic, and the epidemic there is expected to end within two weeks (from feb. 15th, 2020). meanwhile, for most parts of the mainland, including the majority of cities in hubei province, it will end no later than the middle of march. we should also point out that the situation in wuhan city is still very severe. more effective policies and more efforts on medical care and clinical research are urgently needed. we expect that final success against the epidemic will be reached at the beginning of this april. furthermore, by inverse inference, we find that the outbreak of this epidemic in the mainland, hubei, and wuhan can all be dated back to 20-25 days before jan. 20th, in other words the end of dec. 2019, which is consistent with public reports. although we lack knowledge of the first infected case, our inverse inference may still be helpful for understanding the epidemic of covid-19 and preventing similar viruses in the future. the authors declare no conflict of interest.
epidemic doubling time of the 2019 novel coronavirus outbreak by province in mainland china. medrxiv
epidemiological and clinical features of the 2019 novel coronavirus outbreak in china
preliminary estimation of the basic reproduction number of novel coronavirus (2019-ncov) in china
the novel coronavirus, 2019-ncov, is highly contagious and more infectious than initially estimated. medrxiv
serial interval of novel coronavirus (2019-ncov) infections. medrxiv
assessing spread risk of wuhan novel coronavirus within and beyond china
using predicted imports of 2019-ncov cases to determine locations that may not be identifying all imported cases. medrxiv
epidemic size of novel coronavirus-infected pneumonia in the epicenter wuhan: using data of five-countries' evacuation action. medrxiv
estimating the daily trend in the size of covid-19 infected population in wuhan. medrxiv
estimation of the asymptomatic ratio of novel coronavirus (2019-ncov) infections among passengers on evacuation flights
early dynamics of transmission and control of 2019-ncov: a mathematical modelling study. medrxiv
the effect of travel restrictions on the spread of the 2019 novel coronavirus (2019-ncov) outbreak. medrxiv
the impact of traffic isolation in wuhan on the spread of 2019-ncov. medrxiv
feasibility of controlling 2019-ncov outbreaks by isolation of cases and contacts
effectiveness of airport screening at detecting travellers infected with 2019-ncov. medrxiv
predictions of 2019-ncov transmission ending via comprehensive methods
a data driven time-dependent transmission rate for tracking an epidemic: a case study of 2019-ncov
novel coronavirus 2019-ncov: early estimation of epidemiological parameters and epidemic predictions. medrxiv
estimation of the transmission risk of the 2019-ncov and its implication for public health interventions
an updated estimation of the risk of transmission of the novel coronavirus (2019-ncov). infectious disease modelling
transmission dynamics of 2019-ncov in malaysia. medrxiv
lockdown may partially halt the spread of 2019 novel coronavirus in hubei province
interventions targeting air travellers early in the pandemic may delay local outbreaks of sars-cov-2. medrxiv
simulating the infected population and spread trend of 2019-ncov under different policy by eir model. medrxiv
the lockdown of hubei province causing different transmission dynamics of the novel coronavirus (2019-ncov) in wuhan and beijing. medrxiv
jing-an cui, and ling yin. a mathematical model for simulating the transmission of wuhan novel coronavirus. biorxiv
a time delay dynamical model for outbreak of 2019-ncov and the parameter identification
modeling and prediction for the trend of outbreak of ncp based on a time-delay dynamic system
partial equilibrium approximations in apoptosis. ii. the death-inducing signaling complex subsystem
chiu fan lee, and ya jing huang. statistical mechanics and kinetics of amyloid fibrillation
we acknowledge the financial support from the national natural science foundation
medrxiv preprint doi: https://doi.org/10.1101/2020.02.16.20023465. the copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. all rights reserved. no reuse allowed without permission.
key: cord-339743-jxj10857 authors: liu, h.; bai, x.; shen, h.; pang, x.; liang, z.; liu, y. title: synchronized travel restrictions across cities can be effective in covid-19 control date: 2020-04-06 journal: nan doi: 10.1101/2020.04.02.20050781 sha: doc_id: 339743 cord_uid: jxj10857
mobility control measures are of crucial importance for public health planning in combating the covid-19 pandemic. previous studies established the impact of population outflow from wuhan on the spatial spread of coronavirus in china and hinted at the impact of the other three mobility patterns, i.e., population outflow from hubei province excluding wuhan, population inflow from cities outside hubei, and intra-city population movement. however, the overall impact of all mobility patterns and the impact of the different timing of mobility restriction interventions have not been systematically analyzed.
here we use the cumulative confirmed cases and mobility data of 350 chinese cities outside hubei to explore the relationships between all mobility patterns and epidemic spread, and estimate the impact of local travel restrictions, both in terms of level and timing, on epidemic control based on mobility change. the relationships were identified by using pearson correlation analysis and stepwise multivariable linear regression, while scenario simulation was used to estimate the mobility change caused by local travel restrictions. our analysis shows that: (1) all mobility patterns correlated with the spread of the coronavirus in chinese cities outside hubei, while the correlations dropped with the implementation of travel restrictions; (2) the cumulative confirmed cases in the two weeks after the wuhan lockdown were mainly brought by three patterns of inter-city population movement, while those in the third and fourth weeks after it were significantly influenced by the amount of intra-city population movement; (3) the local travel restrictions imposed by cities outside hubei have averted 1,960 (95%pi: 1,474-2,447) more infections, equal to 22.4% (95%pi: 16.8%-27.9%) of confirmed ones, in the two weeks after the wuhan lockdown, while more synchronized implementation would further decrease the number of confirmed cases in the same period by 15.7% (95%pi: 15.4%-16.0%) or 1,378 (95%pi: 1,353-1,402) cases; and (4) local travel restrictions on different mobility patterns offer different degrees of protection to cities with or without initial confirmed cases until the wuhan lockdown. our results support the effectiveness of local travel restrictions and highlight the importance of synchronized implementation of mobility control across cities in mitigating covid-19 transmission. previous studies show that it is population mobility that accelerates the spatial spread of the epidemic, while travel restrictions could contribute to the epidemic control.
[5-13] the spring migration ("chunyun" in chinese) of 2020, which is the earth's most massive annual human migration, started on jan 10 [13]. although wuhan, the capital city of hubei province, suspended all inbound and outbound public transport from 10:00 am on jan 23, 2020, about five million people had already left wuhan before the quarantine [14]. studies show that while the wuhan lockdown greatly slowed the spread of covid-19 [6-10], the size of the population outflow from wuhan was highly correlated with the imported cases in other cities in china [8, 11-13, 15-18]. while the impact of the population outflow from wuhan is well established, the impact of other mobility patterns on the epidemic trajectory has not been well understood. local population mobility for a city includes both intra-city and inter-city patterns. the inter-city mobility can be categorized into three sources, i.e., from wuhan, from hubei province (excluding wuhan), and from cities outside hubei. as nearly two-thirds of the population outflow from wuhan flooded into other cities within hubei province [13, 17], it is important to consider the population outflow from hubei (excluding wuhan) as potentially the most significant source of epidemic transmission risk after jan 23. the intra-city population movement is also an essential factor: research shows that cities that introduced pre-emptive intra-city movement restrictions had 33.3% fewer confirmed cases in the first week of the epidemic outbreak than those that started restrictions after confirmed cases emerged [6], pointing to the importance of the timing of these measures. for inter-city population movement among cities outside hubei, which was restricted after the wuhan lockdown, different studies reach inconsistent conclusions.
on the one hand, the implementation of inter-city travel restrictions cannot significantly reduce the number of confirmed cases during the first week of city outbreaks [6]; on the other hand, the transmission model of covid-19 cannot be accurately established without the inter-city connections [19]. most of these existing studies are based on one or two mobility patterns, and the overall impact of these four mobility patterns on the spread of covid-19 is not well understood. the uncertainty in the number of confirmed cases in the early stage of the epidemic spread [5, 19], ranging from 427 (officially confirmed) to potentially over 10,000, underscores the importance of considering all four mobility patterns in the covid-19 spread, perhaps more so than the number of confirmed cases. in this study, we aim to investigate the relationship between population mobility, both inter-city and intra-city, and the spread of covid-19. based on the mobility change, the impact of local travel restrictions in other cities, besides the wuhan lockdown, on the epidemic control was estimated. given that the three mobility patterns, except the population outflow from wuhan, are heavily influenced by policy measures introduced by the central and local governments, such an investigation is important for evaluating the effectiveness and understanding the influence of the timing of different measures, which can in turn inform policy interventions in the future. we collected the number of laboratory-confirmed cases from daily official reports from the health commissions of the 34 provincial-level administrative units from jan 23, 2020. the provincial reports include the total instances as well as the breakdown for cities. the city-level data we used includes 367 cities, i.e., four cities directly under the central government, 333 prefecture-level cities, and 29 county-level cities directly under the provincial government, covering all the areas of mainland china.
a total of 350 cities are from outside hubei. to capture population movement among and within cities, we derived the mobility data from baidu qianxi (http://qianxi.baidu.com/) [20]. the data is calculated and analyzed from the location-based service (lbs) of both baidu map and one flight path monitoring app, baidu tianyan. the data of both 2019 and 2020 were acquired, aligned by the chinese lunar calendar, from the start of the spring migration (jan 10, 2020). the data of population inflow from wuhan, population inflow from hubei excluding wuhan, and inter-city population movement were all calculated based on the baidu mobility index. the baidu mobility index records daily outflow and inflow to and from each of the 367 cities, and is comparable among cities. we assume that five million people left wuhan between the start of the spring migration and the wuhan lockdown, and scaled the index to approximate values of population size. for each day, the top 100 destination cities for population outflow from wuhan and other cities in hubei were recorded. we believe the data are representative of population outflow from wuhan and from hubei excluding wuhan [19]. for the inter-city connections, we used the daily inflow values of each city. the population migration from hubei was excluded since it was considered separately. the data of the intra-city population movement was computed based on the baidu intra-city mobility index. the intra-city mobility index, ranging from 0.3 to 8.0, reflects the proportion of people traveling within cities in the resident population.
we used 0.1 times the daily intra-city mobility index, ranging from 3.0% to 80.0%, multiplied by the permanent population as the proxy for daily intra-city travel. the data of the permanent population at the end of 2018 was retrieved from the statistical yearbooks of provinces and cities. imported cases and locally infected cases might be brought by different mobility patterns. thus, we used the cumulative confirmed cases of two time periods in cities outside hubei to study their relationship with different mobility data. the cumulative confirmed cases in two periods were computed to represent the epidemic spread at different stages. the first period, referred to hereafter as stage one, was the first two weeks after the wuhan lockdown, from jan 24 to feb 06. since the incubation period of covid-19 usually ranged from one to 14 days, the majority of imported cases would be identified in this period. the second period, referred to hereafter as stage two, was the two weeks after stage one, from feb 07 to feb 20. most of the confirmed cases in this period should have been infected in cities outside hubei. the ratio of new confirmed cases in the population outflow was typically used to estimate the virus transmission risk of population outflow. however, the epidemic outbreak might have been underestimated at the early stage due to the lack of attention or detection ability. until the wuhan shutdown, only 427 cases were confirmed as positive for covid-19, while 86% of all infections might have been undocumented [19]. facing such a big gap, we investigate the relationship between mobility and epidemic spread without considering the proportion of confirmed cases in the population migration. to identify the impact of different mobility patterns on the epidemic spread, we used linear regression models to assess the relationship between population mobility data and the confirmed cases.
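the two mobility proxies described above, the scaling of the outflow index to five million people and the intra-city travel proxy, amount to simple arithmetic. the sketch below shows one way to compute them; the function names, the choice to scale by the sum of the daily index over the period, and all sample numbers are assumptions made for illustration.

```python
def scale_outflow_index(daily_index, total_people=5_000_000):
    """Convert a daily outflow index into head counts.

    Assumes the index summed over the whole period corresponds to
    `total_people` (five million in the text).
    """
    total_index = sum(daily_index)
    return [total_people * value / total_index for value in daily_index]

def intra_city_travellers(mobility_index, permanent_population):
    """Proxy for daily intra-city travel: 0.1 * index * permanent population."""
    return 0.1 * mobility_index * permanent_population

# Hypothetical index values for three days and a hypothetical city of one million.
scaled_outflow = scale_outflow_index([4.0, 6.0, 10.0])
low_travel = intra_city_travellers(0.3, 1_000_000)   # index 0.3, i.e. 3.0%
high_travel = intra_city_travellers(8.0, 1_000_000)  # index 8.0, i.e. 80.0%
```

the two end points of the index range reproduce the 3.0% and 80.0% shares quoted in the text.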
[6] in stage one, the confirmed cases could be imported from hubei (both wuhan and cities outside wuhan), from cities outside hubei, or from the locality. therefore, the analysis was performed using the model: ε is the constant coefficient that reflects information residue. since the mean incubation period of covid-19 was 5.2 days (95% confidence interval [ci], 4.1 to 7.0) [21], varying across studies [7, 22, 23], we used the mobility data of the one week before stage one in the model [7]. the data of E(w) was from jan 17 to jan 23, while the data of O and I were all from jan 17 to jan 30. for E(h), we used the data after the wuhan lockdown because it was not the main risk of transmission before that [17]. thus, the data of E(h) were from jan 23 to the suspension of all interprovincial transport to and from hubei, jan 29 (table s1). as for stage two, the imported cases from hubei were not considered: C is the number of initial confirmed cases until feb 06. the statistical analysis included two steps. first, pearson correlation analysis was applied to check whether different mobility patterns were correlated with the spread of covid-19 in cities outside hubei in the two time periods. correlated data would be introduced into the following linear regression. second, stepwise multivariate linear regressions were built to explore the explanatory capacity of mobility data for the confirmed cases.
a model with the highest adjusted r^2 was taken as the best-fit one. cities without confirmed cases until the end of each stage were excluded from the study. besides for all the cities with confirmed cases after the wuhan lockdown (C we assume that, after the wuhan lockdown, the local travel restrictions in cities outside hubei contributed to the epidemic control by influencing population mobility [10, 16]. data for three mobility patterns, except the population outflow from wuhan, in two scenarios were obtained under assumptions (table s2). we assume the transmission conditions and virus characteristics in china would remain unchanged, and the best-fit models generated from the regression analysis were used to estimate the number of confirmed cases in cities outside hubei based on the mobility data. the differences between the estimated numbers and the actually reported ones are attributed to the implementation of local travel restrictions, implying their impact on the epidemic control. to simulate the scenario of no travel restrictions in cities outside hubei (scenario 1), we used the 2019 baidu data and assumed the mobility scale captured in 2019 would be similar to that of the equivalent time period during 2020 [19]. therefore, we adopted the mobility direction of 2020 and the mobility scale of 2019 to simulate the population mobility after jan 23, 2020, as if there were no local travel restrictions. all data before the wuhan lockdown remained unchanged.
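the two-step analysis described above, pearson correlation followed by stepwise regression with the highest adjusted r^2 as selection criterion, can be sketched in plain python. this is an illustrative forward-selection variant under stated assumptions, not the procedure or software used in the study; the predictor names and the toy data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x)
                           * sum((b - my) ** 2 for b in y))

def solve(a, b):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def adjusted_r2(columns, y):
    """Ordinary least squares with intercept (normal equations); adjusted R^2."""
    n, k = len(y), len(columns)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k + 1)]
           for a in range(k + 1)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k + 1)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - (ss_res / ss_tot) * (n - 1) / (n - k - 1)

def forward_stepwise(predictors, y):
    """Greedily add the predictor that most increases adjusted R^2."""
    chosen, best = [], float("-inf")
    remaining = dict(predictors)
    while remaining:
        scores = {name: adjusted_r2([predictors[c] for c in chosen] + [col], y)
                  for name, col in remaining.items()}
        name = max(scores, key=scores.get)
        if scores[name] <= best:
            break  # no remaining predictor improves the fit
        chosen.append(name)
        best = scores[name]
        del remaining[name]
    return chosen, best

# Toy data for 12 hypothetical cities; names and values are illustrative only.
predictors = {
    "inflow_wuhan": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    "inflow_hubei": [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8],
    "unrelated":    [2, 7, 1, 8, 2, 8, 1, 8, 2, 8, 4, 5],
}
noise = [0.5, -0.5, 0.3, -0.3, 0.2, -0.2, 0.4, -0.4, 0.1, -0.1, 0.6, -0.6]
cases = [2 * a + 3 * b + e for a, b, e in
         zip(predictors["inflow_wuhan"], predictors["inflow_hubei"], noise)]

correlations = {name: pearson(col, cases) for name, col in predictors.items()}
selected, best_adjusted_r2 = forward_stepwise(predictors, cases)
```

a statistics package would normally also apply significance tests and check multicollinearity when entering or removing variables; the sketch keeps only the adjusted r^2 criterion named in the text.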
the daily population outflow from hubei (excluding wuhan), inter-city population movement, and intra-city population movement after feb 03, 2019, aligned by the chinese lunar calendar with jan 23, 2020, were used as proxy mobility data for the status of no local travel restrictions in cities outside hubei. to understand the role of the timing of local travel restrictions in other cities relative to the wuhan lockdown, our scenario 2 assumes more timely travel restrictions being imposed in cities outside hubei. the same models and methods were used as above, and the only change was the mobility data. before jan 30, cities successively suspended their inter-provincial transport to and from hubei, and restricted their inter-city and intra-city population movement to varying degrees. that is to say, the inter-city and intra-city population movements were all limited after jan 30. we assume all travel restrictions in cities outside hubei were imposed at the same time as the wuhan lockdown. the daily population outflow from hubei excluding wuhan, and the inter-city and intra-city population movement from jan 31 to feb 06 replaced those from jan 23 to jan 30, as proxy mobility data for the status of timely travel restrictions. the pearson coefficients indicate positive statistical correlations between all mobility patterns and the number of confirmed cases in both stage one and stage two (p<0.01) (table 1-1 and 1-2, figure s1). this suggests that cities with more population inflow from wuhan and other cities in hubei, more population migration from cities outside hubei, and more intra-city population movement would have more
the best-fitting linear regression models for epidemic development in both stage one and stage two are listed in table 2. in stage one, three mobility patterns, i.e., population inflow from wuhan and other cities in hubei, and inter-city population movement, together with the number of initial confirmed cases, could explain 73.3% of the inter-city differences in newly reported infections in cities outside hubei. the intra-city population movement is highly correlated with the inter-city one (r=0.812, p<0.01), but its impact on the number of cumulative confirmed cases is smaller than that of the inter-city one. this result implies the existence of imported cases from cities outside hubei. in stage two, intra-city population movement and the number of initial confirmed cases could explain 50.9% of the inter-city differences in newly reported infections in cities outside hubei. the impact of inter-city population movement is not significant. in stage one, the epidemic spread in cities outside hubei, with or without initial confirmed cases, is significantly influenced by population inflow from hubei, including from wuhan and other cities in hubei. meanwhile, the inter-city population movement from cities outside hubei and the intra-city population movement have varied impacts. for cities without initial confirmed cases, both inter-city and intra-city population movements significantly influenced the epidemic development. in other words, four kinds of mobility patterns jointly affected the epidemic spread in these cities.
for cities with initial confirmed cases, both inter-city (from cities outside hubei) and intra-city population movement have no significant impact on their epidemic development. two reasons may explain this. one is that cases directly imported from hubei made up the majority of reported cases in this stage, so that the effect of population inflow from hubei overshadowed the others. the cities with initial confirmed cases are the leading destinations for population outflow from hubei. these cities, accounting for 32.6% of cities outside hubei, accommodated 57.9% of population outflow from wuhan before its closure and 68.1% of population outflow from hubei excluding wuhan after the wuhan lockdown. the second reason is that, in these cities, both inter-city (from cities outside hubei) and intra-city population movement are highly correlated with the population inflow from wuhan (r=0.700 and 0.748 respectively, p<0.01). the inter-city and intra-city mobility data were excluded from the best-fit model due to multicollinearity. most of the estimated increase in stage one is due to the population inflow from hubei after the wuhan lockdown. if only the population outflow from hubei was prohibited after the wuhan lockdown, i.e., the inter-city and intra-city population movement were not restricted in cities outside hubei, the estimated increase in the number of confirmed cases is 576 (95%pi: 437-716). thus, more than two-thirds of the expected increase in confirmed cases is due to the population outflow from hubei (excluding wuhan) if there were no travel bans except the wuhan lockdown. the intra-city travel ban plays a significant role in preventing local infections. our results show there are 248 cities that have confirmed cases in stage two.
if there were no intra-city travel restrictions in these cities, the confirmed cases are estimated to increase by 33.1% (95%pi: 19.3%-46.9%) of the observed ones. the intra-city travel bans made the intra-city population movement from jan 31 to feb 13, after the public holiday of the spring festival, decrease to 50.1% of that in the previous year, ranging from 23.7% to 87.8%. for these cities, the inter-city population movement decreased on average to 15.8% of that in the previous year, ranging from 2.8% to 55.7%. it is probably because of the strict inter-city travel ban that the influence of inter-city population movement on local infections is insignificant. although the travel restrictions in chinese cities outside hubei have played essential roles in the epidemic control, our results suggest that more timely travel restrictions after the wuhan lockdown could better control the spread of the virus. if all cities outside hubei imposed their inter-city, including to and from hubei, and intra-city travel bans at the same time as the wuhan lockdown, these cities would report 1,378 [...] mobility control measures are of crucial importance for public health planning in the outbreak of the [...] in this study, we explore the relationships between mobility and epidemic spread and [...] if the whole hubei province was not quarantined after the wuhan lockdown, further national seeding and subsequent infections might become inevitable. by mar 12, 2020, hubei excluding wuhan has more confirmed cases (17,795) than china excluding hubei (13,032). 24 our results suggest that, in cities outside hubei, the travel restriction of hubei (excluding wuhan) is more effective than other inter-city
and intra-city travel bans in controlling the development of the epidemic in the two weeks after the wuhan lockdown. on jan 26, all airports and railway stations in hubei were closed. 25 before jan 30, all other provinces in mainland china suspended their inter-provincial road transport to and from hubei (table s1). for cities without reported infections before the wuhan lockdown, the preventive prohibition of both inter-city and intra-city population movement is essential to their epidemic control. the prohibition of inter-city population movement from cities outside hubei and of intra-city population movement prevented 405 more confirmed cases (95%pi: 342-468) in stage one (table s3), accounting for 69.9% of the cases prevented by all local travel bans. the travel controls in cities without initial confirmed cases tended to be later or looser than those in cities with initial infections. from jan 24 to jan 30, 2020, the inter-city population movement from cities outside hubei and the intra-city population movement in cities without initial confirmed cases decreased on average to 64.4% and 72.6% of those in the same period of 2019, while the corresponding averages are 59.2% and 65.2% for cities with initial confirmed cases (table s2). the local travel restrictions were necessary and practical for cities without initial confirmed cases, even if they were not that strict.
different mobility patterns influenced the covid-19 spread in different periods. our results show that, in the early stage of epidemic development, it is the inter-city mobility, including from wuhan, from hubei excluding wuhan, and from other cities outside hubei, that promotes the spatial spread of the virus. after the quarantine of the whole of hubei and the prohibition of inter-city transport, the importance of restricting the intra-city population movement is highlighted. if there were no restrictions on intra-city population movement, the confirmed cases in cities outside hubei might increase by 33.1% (95%pi: 19.3%-46.9%) in the third and fourth weeks after the wuhan lockdown. the intra-city travel restrictions played vital roles in the epidemic control in china. it is worth noting that china has implemented many non-pharmacological interventions, not limited to these travel restrictions. source control measures, like isolating people with the virus, monitoring the symptoms of or quarantining healthy contacts, requiring masks for individuals in all public places, etc., have been introduced to reduce potential secondary infections. 26, 27 it is the intensive source control that reduces new local infections. the contribution of population movement from wuhan or hubei to subsequent epidemic development might also be dampened due to the implementation of source control measures. 7 without the implementation of combined source control measures, our study, at least in the model of stage two, might have different results. our study quantified the relationships between mobility patterns and epidemic trajectory in china and highlighted the importance of synchronized travel restrictions across cities. several policy implications
can be drawn. first, the geographical extent of the quarantine should be carefully considered by the government before the official announcement. in the early stage of epidemic development, there might be a non-negligible number of infected but undetected people in the areas that have geographical connections and frequent traffic with the epidemic-stricken ones, like other cities in hubei. in particular, there are many asymptomatic virus carriers that are highly contagious. 23, 28 on the other hand, the expansion of the quarantined area would also bring substantial economic losses. whether to include farther hinterlands, and if so to what extent, will need to be carefully considered in making the quarantine decision. second, our results show the importance of timely and active local countermeasures by cities outside of the epicenter. while actual travel control measures may differ across cities, simply compressing the time gap between wuhan and other cities could further reduce the covid-19 outbreak. it is important for countries and governments to impose timely interventions to combat the pandemic. lh and bx designed the experiments. lh and lz collected data. lh analysed data and wrote the first draft, and ly made the figures. lh, bx, sh, and px interpreted the results and contributed to the final manuscript. all authors approved the final version for submission. we declare no competing interests. who announces covid-19 outbreak a pandemic ) 2. who.
coronavirus disease 2019 (covid-19) situation report -52 there were no newly confirmed cases in four consecutive days in all provinces except for hubei cities and prefectures in hubei excluding wuhan have no new confirmed cases for 9 consecutive days covid-19: challenges to gis with big data the impact of transmission control measures during the first 50 days of the covid-19 epidemic in china the effect of travel restrictions on the spread of the 2019 novel coronavirus (covid-19) outbreak the effects of human mobility and control measures on the covid-19 epidemic in china impact of international travel and border control measures on the global spread of the novel 2019 coronavirus outbreak transmissibility of coronavirus disease 2019 (covid-19) in chinese cities with different transmission dynamics of imported cases temporal relationship between outbound traffic from wuhan and the 2019 coronavirus disease (covid-19) incidence in china estimating clinical severity of covid-19 from the transmission dynamics in wuhan, china covid-19 control in china during mass population movements at new year. the the press conference on covid-19 in hubei distribution of the 2019-ncov epidemic and correlation with population emigration from wuhan estimation of local novel coronavirus (covid-19) cases in wuhan, china from off-site reported cases and population flow data from different sources risk assessment of exported risk of novel coronavirus pneumonia from hubei province nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov2) early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia the lockdown of hubei is in process: 17 cities stopped public transportation covid-19: what is next for public health? taking the right measures to control covid-19. 
the lancet infectious diseases can we contain the covid-19 outbreak with the same measures as for sars? the lancet infectious diseases we collated epidemiological data and mobility data from publicly available data sources. all the data sources we used are documented in the main text and supplementary tables. key: cord-351659-ujbxsus4 authors: jiang, xiandeng; chang, le; shi, yanlin title: a retrospective analysis of the dynamic transmission routes of the covid-19 in mainland china date: 2020-08-19 journal: sci rep doi: 10.1038/s41598-020-71023-9 sha: doc_id: 351659 cord_uid: ujbxsus4 the fourth outbreak of the coronaviruses, known as the covid-19, has occurred in wuhan city of hubei province in china in december 2019. we propose a time-varying sparse vector autoregressive (var) model to retrospectively analyze and visualize the dynamic transmission routes of this outbreak in mainland china over january 31–february 19, 2020. our results demonstrate that the influential inter-location routes from hubei have become unidentifiable since february 4, 2020, whereas the self-transmission in each provincial-level administrative region (location, hereafter) was accelerating over february 4–15, 2020. from february 16, 2020, all routes became less detectable, and no influential transmissions could be identified on february 18 and 19, 2020. such evidence supports the effectiveness of government interventions, including the travel restrictions in hubei. our results imply that, in addition to the origin of the outbreak, virus prevention is of crucial importance in the locations with the largest percentages of migrant workers (e.g., jiangxi, henan and anhui) for controlling the spread of covid-19. www.nature.com/scientificreports/ been analyzed and compared over a range of affected countries (e.g., australia 34, 34 , germany 35 , italy 36, 37 and south korea 38 ).
among this large and growing volume of studies, mathematical and statistical modeling plays a non-negligible role. the classical susceptible exposed infectious recovered (seir) model with its various extensions is the most popular method 39-52. seir family models are effective in exploring the epidemic characteristics of the outbreak, forecasting the inflection point and ending time, and deciding the measures to curb the spreading. despite this, they are less appropriate for identifying transmission routes of the covid-19 outbreak, which have also not been thoroughly investigated in the existing literature. in this paper, we fill this gap and perform a retrospective analysis using publicly available data 53. rather than employing the seir, we develop a time-varying coefficient sparse vector autoregressive (var) model. using the least absolute shrinkage and selection operator (lasso) 54, 55 and the local constant kernel smoothing estimator 56, our model is capable of estimating the dynamic high-dimensional granger causality coefficient matrices. this enables the detection and visualization of time-varying inter-location and self-transmission routes of the covid-19 on a daily basis. the resulting "road-map" can help policy-makers and public-health officers retrospectively evaluate both the effectiveness and unexpected outcomes of their interventions. such an evaluation is critical to winning the current battle against covid-19 in china, providing useful experience for other countries facing the emerging threat of this new coronavirus, and saving lives when a new epidemic occurs in the future. model. throughout this study, we are interested in the growth rate $y_{i,t}$ such that: $$y_{i,t} = \log x_{i,t} - \log x_{i,t-1}, \qquad (1)$$ where $x_{i,t}$ is the accumulated confirmed cases in the provincial-level administrative region (location, hereafter) $i$ on day $t$ ($i = 1, \ldots, N$ and $t = 1, \ldots, T$).
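as a small worked example of the growth rate, which the paper later describes as the difference of the logged accumulated cases, the sketch below computes it for one location. the counts are invented and the function name `growth_rates` is hypothetical.

```python
import math

def growth_rates(cumulative):
    """y[t] = log(x[t]) - log(x[t-1]) for a list of positive cumulative counts."""
    return [math.log(b) - math.log(a) for a, b in zip(cumulative, cumulative[1:])]

# invented cumulative confirmed cases for one location over five days
x = [100, 150, 180, 180, 198]
y = growth_rates(x)  # one value fewer than x: the first day has no predecessor
```

a day with no new cases gives a growth rate of exactly zero, and the growth rates sum to the log of the overall ratio between the last and first counts.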
$T$ and $N$ define the number of days and number of locations under consideration, respectively. we then define $y_t = (y_{1,t}, \ldots, y_{N,t})'$, an $N \times 1$ vector of the growth rates on day $t$. to investigate a dynamic direct transmission of the growth rate among locations, we propose a time-varying coefficient sparse var model, namely the tvsvar model, which assumes that granger causality coefficients are functions of time, such that: [...] (2) where $\alpha_t$ is an $N$-dimensional intercept vector at time $t$. $B_t$ is an $N \times N$ granger causality matrix at time $t$ with a dynamic sparse structure, for which entries can be exactly zero and the locations of zeros can vary with time. $\epsilon_t$ is an $N \times 1$ vector of error terms. the sparsity of $B_t$ is assumed because $N$ could be even larger than $T$ in our case, which leads to very unstable estimations and problematic interpretations of $B_t$. one important benefit of using the proposed tvsvar to model the transmissions is that the granger causality matrix, $B_t$, can provide both the direction and strength of the route on day $t$. for example, the $ij$th entry in $B_t$ measures the strength of the transmission from location $i$ to location $j$ on day $t$. the $i$th diagonal entry of $B_t$ represents the self-transmission in location $i$ that captures the relationship between the growth rate in the current and previous days. more critically, the sparse structure eases the interpretation of $B_t$ because many weak transmissions may be of a random nature. the corresponding coefficients, therefore, can be treated as noise and are shrunk to exactly zero. moreover, a time-varying design of $B_t$ allows us to investigate changes in the identified transmissions over time. for instance, let 11, 14 and 17 indicate hubei, jiangxi and shanghai, respectively. on day $t = 1$, the estimated $\beta_{11,14,1}$ and $\beta_{14,17,1}$ are 0.52 and 0.35, respectively.
this suggests that on that day, moderately strong transmission routes of confirmed covid-19 cases are detected from hubei to jiangxi and from jiangxi to shanghai. further, the estimated $\beta_{17,i,1}$ for all $i = 1, \ldots, 20$ are zero, suggesting that the confirmed cases in shanghai do not spread to other locations on day 1. on day $t = 2$, we observe estimated $\beta_{11,14,2} = 0.61$, $\beta_{14,17,2} = 0.41$ and all $\beta_{17,i,2} = 0$. thus, the two detected routes from hubei to jiangxi and from jiangxi to shanghai have become more influential, whereas the cases in shanghai are still yet to spread out on day 2. the above results cannot be derived using the classic epidemiological seir model. to capture both the dynamic and the sparse structure of the granger causality coefficients, we solve the following optimization problem: [...] (3) we use the epanechnikov kernel $K(x) = 0.75(1 - x^2)_+$ and a unified bandwidth for each $i$ ($b_i \equiv b$) to avoid a large number of tuning parameters. the coefficient $\beta_{i,j,t}$ denotes the $ij$th entry of the granger causality matrix $B_t$, and $\lambda$ is the tuning parameter that aims to shrink insignificant $\beta_{i,j,t}$ to zero and thus controls the sparsity of $B_t$. another essential feature of our proposed model is that the adaptive weights $w_{i,j,t}$ are employed to penalize $\beta_{i,j,t}$ differently in the lasso (l1) penalty 54, 55. the choice of weights $w_{i,j,t}$ takes account of the prior knowledge about the transmissions and can be specified by the users. in this study, we consider $w_{i,j,t}$ as the reciprocal of the accumulated confirmed cases in location $i$ on day $t - 1$. that is, the growth rate of a location with fewer accumulated confirmed cases is less likely to influence the growth rates of others, and thus more likely to be shrunk to zero. the final sparsity structure of $B_t$ is still data-driven. the estimators as in (3) can also be viewed as a penalized version of the local constant kernel smoothing estimator 56.
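the building blocks named above (epanechnikov kernel, lasso shrinkage, adaptive weights) can be sketched as follows. this is an illustration under stated assumptions, not the authors' estimator: the kernel argument is taken to be the day difference divided by a bandwidth in days, the loss is written for a single destination location, and all function names are hypothetical. soft-thresholding is the elementary shrinkage step used by iterative algorithms for l1-penalized problems.

```python
def epanechnikov(u):
    """K(u) = 0.75 * (1 - u^2)_+ : zero outside [-1, 1]."""
    return 0.75 * max(1.0 - u * u, 0.0)

def soft_threshold(z, thresh):
    """Shrink z toward zero by thresh; values within [-thresh, thresh] become 0."""
    if z > thresh:
        return z - thresh
    if z < -thresh:
        return z + thresh
    return 0.0

def adaptive_weight(cumulative_prev):
    """Penalty weight: reciprocal of the source location's accumulated cases on day t-1."""
    return 1.0 / cumulative_prev

def penalized_local_loss(y, j, t, alpha_j, beta_j, w, lam, b):
    """Kernel-weighted squared error for destination j around day t, plus a
    weighted L1 penalty. y[s][i] is the growth rate of location i on day s;
    beta_j[i] is the assumed effect of source i on destination j."""
    fit = 0.0
    for s in range(1, len(y)):
        pred = alpha_j + sum(bi * yi for bi, yi in zip(beta_j, y[s - 1]))
        fit += epanechnikov((s - t) / b) * (y[s][j] - pred) ** 2
    penalty = lam * sum(wi * abs(bi) for wi, bi in zip(w, beta_j))
    return fit + penalty

# a source with many accumulated cases receives a small penalty weight
w_large_source = adaptive_weight(4586.0)
w_small_source = adaptive_weight(100.0)
```

because the weight is the reciprocal of accumulated cases, coefficients for routes out of a location with few cases are penalized more heavily and are therefore more likely to be set to zero.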
we utilize a modified version of the fast iterative shrinkage-thresholding algorithm (fista) 57 to solve the optimization problem (3). given a bandwidth $b$ and a penalty parameter $\lambda$, we can find the estimator $(\alpha_t, B_t)$ for each day $t$ and observe the dynamic patterns of the transmission over time $t$ for each pair of locations. the selection of $b$ is critical to detecting the influential routes, which depends on the chosen criterion. among the existing literature, a popular approach is to adopt the cross-validation strategy, such that based on the estimated $(\alpha_t, B_t)$, the model will not 'overfit' $y_t$. as for the time-series analysis, we use an expanding-window sample to implement the cross-validation 58. this requires that the chosen $b$ will minimize the cross-validated forecast error, which is measured by the one-step-ahead root mean squared forecast error (rmsfe), such that [...] where $[T_0, T_1]$ is the evaluation period, which is given by the last third of the data in our study, $\hat{y}_{i,t+1}$ denotes the one-step-ahead forecast for location $i$ based on the data up to day $t$, and $y_{i,t+1}$ defines the observed growth rate at day $t + 1$ for location $i$. note that rmsfe is analogous to the square root of the popular least squared errors. an interpretation is that the chosen $b$ will lead to the minimized total out-of-sample forecast errors of the growth rates of confirmed cases over the last third of the sample period. data. the data studied in this paper include confirmed covid-19 cases which occurred in mainland china. the data are publicly available and sourced from the website of the national health commission of the people's republic of china 53. the data coverage ranges from january 29, 2020 to february 19, 2020, during which no missing data were recorded at location level. the accumulated cases and the associated growth rates, grouped by the total national number, cases in hubei and cases in all other locations, are plotted in fig. 1a, b, respectively.
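the expanding-window rmsfe criterion described above can be sketched as follows. the exact normalization of the paper's formula is not recoverable here, so averaging over all location-days is an assumption (it does not change which bandwidth minimizes the criterion). the placeholder forecaster `last_value` and the numbers are invented; in practice the forecaster would be the fitted model for a candidate bandwidth.

```python
import math

def rmsfe(forecasts, actuals):
    """Root mean squared forecast error over all (day, location) pairs."""
    sq = [(f - a) ** 2
          for f_row, a_row in zip(forecasts, actuals)
          for f, a in zip(f_row, a_row)]
    return math.sqrt(sum(sq) / len(sq))

def expanding_window_rmsfe(y, forecaster, start):
    """For each day t >= start, forecast day t+1 from y[0..t], then score
    the forecasts against the observed y[t+1]."""
    preds = [forecaster(y[: t + 1]) for t in range(start, len(y) - 1)]
    return rmsfe(preds, y[start + 1:])

def last_value(history):
    """Placeholder forecaster: tomorrow's growth rates equal today's."""
    return history[-1]

# invented growth rates for two locations over four days
y = [[0.3, 0.2], [0.2, 0.1], [0.1, 0.1], [0.1, 0.0]]
score = expanding_window_rmsfe(y, last_value, start=1)
```

to select a bandwidth, one would compute this score for each candidate value and keep the one with the smallest score, with `start` set so that the evaluation period covers the last third of the sample.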
the total national (hubei) accumulated confirmed cases increased rapidly from 7,736 (4,586) on january 29, 2020 to 75,101 (62,457) on february 19, 2020. note that on february 12, 2020, confirmed cases in hubei included those confirmed by both laboratory and clinical diagnosis, leading to a one-time hump of the accumulated number. compared to those of hubei, confirmed cases of other locations took up a smaller proportion of the total national number, ranging from 40.7% on january 29, 2020 to 16.8% on february 19, 2020. this suggests that the growth rate of other locations should be lower than that of hubei, which is consistent with fig. 1b. throughout our investigation period, except for the one-time hump on february 12, 2020, growth rates of hubei and the rest steadily declined, from 33% and 25% to 5% and 1%, respectively. estimation results: transmission routes. by taking the difference of the logged accumulated cases and applying one lag, our estimated transmission routes are available from january 31 to february 19, 2020 (two observations are lost). to avoid potential noise caused by small numbers, we only include data of locations which had at least 150 accumulated confirmed cases as of february 19, 2020. altogether, our modeled sample contains the confirmed cases of 20 locations. we first test the stationarity of the 20 growth rates separately. based on the augmented dickey-fuller test, only the rates of three locations (beijing, hainan and heilongjiang) are insignificant, which is in line with the employed 10% significance level. the detailed results are available upon request. the model explained in "methods" is then fitted incorporating all the 20 growth rates. a non-zero estimate of $\beta_{i,j,t}$, the $ij$th entry of $B_t$ in (2), indicates that on the $t$th day, the growth rate of location $j$ is granger caused by that of location $i$. in other words, there is a transmission route from location $i$ to location $j$.
among the 20-day results, we noticed that the estimated transmission routes on days 1-5 changed considerably on a daily basis. from the sixth day onwards, however, those estimated routes were more steady. hence, we plot the estimates on days 1-5 and those on every fifth day thereafter in fig. 2. note that estimates smaller than 0.2 (non-influential) are not presented, for better visual illustration. also, this research focuses on the analysis of mainland china only, which excludes taiwan, macau, hong kong and all important islands of china's territory, such as those located in the south china sea. the plots presented in fig. 2 therefore do not present a complete map of the territory of china, nor should they be used for purposes other than displaying identified transmission routes of covid-19 in mainland china. the readers are directed to the national administration of surveying, mapping and geographic information of the people's republic of china, should they need to precisely explore the scope of maps, national boundaries and the drawing of important islands of the chinese territory. in fig. 2, we use colors from light orange (small) to dark red (large) to indicate the accumulated confirmed cases in each location, up to time $t$. estimated transmission routes are colored in blue. self-transmissions (indicated by $\beta_{i,i,t}$) are denoted by dots, and a larger dot suggests a larger estimated $\beta_{i,i,t}$. inter-location transmission (indicated by $\beta_{i,j,t}$, where $i \neq j$) is represented by arrows, with the transparency indicating the magnitude of the estimated $\beta_{i,j,t}$. on the first day (january 31, 2020), there were influential inter-location transmissions from hubei to jiangxi, heilongjiang, zhejiang, henan, shandong, jiangsu and shaanxi, sorted by the magnitudes of strength (big to small).
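the reading of an estimated coefficient matrix described above (entry (i, j) is a route from location i to location j, diagonal entries are self-transmissions, estimates below 0.2 are treated as non-influential) can be sketched as follows. the 3-by-3 matrix is invented apart from the two example values 0.52 and 0.35 quoted in the text, and the function names are hypothetical.

```python
def influential_routes(B, names, threshold=0.2):
    """Off-diagonal entries at or above threshold as (origin, destination,
    strength), strongest first. B[i][j] is the route from i to j."""
    routes = [(names[i], names[j], B[i][j])
              for i in range(len(names))
              for j in range(len(names))
              if i != j and B[i][j] >= threshold]
    return sorted(routes, key=lambda r: r[2], reverse=True)

def influential_self(B, names, threshold=0.2):
    """Diagonal entries (self-transmissions) at or above threshold."""
    return [(names[i], B[i][i]) for i in range(len(names)) if B[i][i] >= threshold]

names = ["hubei", "jiangxi", "shanghai"]
B = [[0.10, 0.52, 0.00],   # row: routes out of hubei
     [0.00, 0.25, 0.35],   # row: routes out of jiangxi
     [0.00, 0.00, 0.05]]   # row: routes out of shanghai
routes = influential_routes(B, names)
self_routes = influential_self(B, names)
```

with these values the sketch reports the hubei to jiangxi route first, then jiangxi to shanghai, and a single influential self-transmission in jiangxi.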
there were a few additional such transmissions detected on the second day, including those from hubei to guangxi, from jiangxi to fujian, and from guangdong to anhui, yunnan and hunan. the number of such identified inter-location routes, however, reduced rapidly over the next three days. on the fifth day (february 4, 2020), no influential transmission routes were found from hubei to directly affect other locations, and there were only three influential routes identified nationally, including zhejiang-shaanxi, zhejiang-jiangxi and jiangxi-shanghai. the number of those detected inter-location routes declined again in the next few days, and on day 13, only henan-heilongjiang was found influential. on days 19 and 20 (february 18 and 19, 2020), there were no influential inter-location transmissions identified. the above findings suggest that the number of influential inter-location transmissions overall dropped quickly in the first five days and then reduced steadily for the remaining 15 days. this is consistent with the observations of fig. 3a, where the time-varying estimates of the granger causality of hubei on other locations are plotted. on each day, we report the mean, standard deviation (std. dev.), the 25% quantile ($q_1$) and 75% quantile ($q_3$) of those estimates in table 1, which also leads to consistent findings. as for the self-transmission, we first examine (fig. 2b). it can be seen that there were quite a few detected influential self-transmissions on the first two days. however, this number dropped quickly over days 3-5, and only self-transmissions of heilongjiang, guangdong and zhejiang were found influential on day 5. since then, the number of influential self-transmissions increased quickly with growing magnitudes (influence). on the sixteenth day (february 15, 2020), 16 out of the 20 examined locations had an estimated $\beta_{i,i,16}$ of at least 0.2.
those large self-transmissions, however, disappeared rapidly again in the next 3 days. on february 18 and 19, 2020, there were no influential self-transmissions identified. this is consistent with our findings in fig. 3b, where time-varying estimated $\beta_{i,i,t}$ are plotted for each location. we report daily descriptive statistics of those estimates in table 1, which also results in consistent conclusions. since january 23, 2020, many cities in mainland china started to introduce travel restrictions, including five cities (wuhan, huanggang, ezhou, chibi and zhijiang) of hubei 15. according to who's situation report 59, the average incubation period of covid-19 is up to 10 days. thus, our estimated dynamic transmission routes support the significant effectiveness of the interventions taken by the chinese authorities 15-25. this is evidenced by fig. 2a-e, where the number of influential inter-location transmissions from hubei to other locations reduced very quickly. compared to multiple influential routes originating in hubei detected on the first two days (january 31 and february 1, 2020), by february 4, 2020 (around 10 days after the travel restrictions), there were already no such transmissions identified. on the other hand, from february 5 to 16, 2020, table 1 suggests that the averaged magnitudes of self-transmission on each day were strengthening steadily. this may also be explained by the interventions, which have effectively blocked inter-location transmissions, such that the growth rate of each location could only be caused by its internal transmissions. we now focus on the inter-location transmission routes. since influential routes from hubei were no longer detected from day 5, we calculate the average $\beta_{i,j,t}$ of the 19 locations affected by hubei over the first four days [...] table 2. this is consistent with the fact that travel restrictions in hubei should not affect the connections among other locations.
in all cases, jiangxi, henan, guangdong, zhejiang and anhui are the most influential origins other than hubei. it is worth noting that jiangxi, henan and anhui belong to both the top origins and destinations of the inter-location transmissions, excluding hubei. since the impact of hubei is not considered, this cannot be explained by the two influential transmission routes of hubei-jiangxi and hubei-henan listed in panel a of table 2. to see this, over days 5-20, the transmissions out of hubei are no longer significant and thus should not affect routes from jiangxi and henan to another location. in contrast, one explanation is the large number of migrant workers moving from jiangxi, henan and anhui to other locations (excluding hubei). according to the report on china's migrant population development of 2017 60, jiangxi (7.25%), henan (6.30%) and anhui (6.27%) are among the top five locations in mainland china, ranked by the percentages of migrant workers in 2017. conclusions coronaviruses have led to three major outbreaks ever since the sars occurred in 2003. although the exact origin is still debatable, the current shock, namely covid-19, has taken place in wuhan, the capital city of hubei province in mainland china. as the fourth large-scale outbreak of coronaviruses, covid-19 is spreading quickly to all provincial-level administrative regions (locations, hereafter) in china and has recently become a world-wide epidemic. as a significant complement to existing research, this study employs a tvsvar model and retrospectively investigates and visualizes the transmission routes in mainland china. as demonstrated in fig. 2, our baseline results reveal both the dynamic inter-location and self-transmission routes. since february 4, 2020, the spread out of hubei was largely reduced, leading to no identifiable routes to other locations. simultaneously, the self-transmissions started to accelerate and peaked around february 15, 2020 for most locations.
given an average incubation period of 10 days, those results support the argued effectiveness of the travel restrictions to control the spread of covid-19, which took place in multiple cities of hubei on january 23, 2020. on february 18-19, 2020, there existed no influential inter-location or self-transmission routes. thus, the growth rates of confirmed cases are of a more random nature in all locations thereafter, implying that the spread of covid-19 has been under control. for the detected inter-location transmissions, our findings demonstrate that jiangxi, heilongjiang, zhejiang, henan and shandong are the top 5 locations most affected via routes directly from hubei. when the influence of hubei is excluded, jiangxi, henan and anhui are among both the top origins and destinations of transmission routes. our results have major practical implications for public health decision- and policy-makers. for one thing, the timely ad-hoc public health interventions that were implemented, including contact tracing, quarantine and travel restrictions, are proven effective. for another, apart from the origin of the virus, virus prevention is also of crucial importance in jiangxi, henan and anhui, the locations with the largest percentages of migrant workers, for controlling future epidemics like the outbreak of covid-19. with limited resources, taking ad-hoc interventions in such locations may most effectively help stop the spread of a new virus, from an economic perspective. the r code that supports the findings of this study is available from the author on request. publisher's note springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
open access this article is licensed under a creative commons attribution 4.0 international license, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the creative commons licence, and indicate if changes were made. the images or other third party material in this article are included in the article's creative commons licence, unless indicated otherwise in a credit line to the material. if material is not included in the article's creative commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. to view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. emerging coronaviruses: genome structure, replication, and pathogenesis history and recent advances in coronavirus discovery pneumonia of unknown aetiology in wuhan, china: potential for international spread via commercial air travel a pneumonia outbreak associated with a new coronavirus of probable bat origin a new coronavirus associated with human respiratory disease in china clinical features of patients infected with 2019 novel coronavirus in wuhan new sars-like virus in china triggers alarm outbreak of pneumonia of unknown etiology in wuhan china: the mystery and the miracle china coronavirus: cases surge as official admits human to human transmission therapeutic options for the 2019 novel coronavirus (2019-ncov) epidemic analysis of covid-19 in china by dynamical modeling estimation of the transmission risk of the 2019-ncov and its implication for public health interventions an updated estimation of the risk of transmission of the novel coronavirus (2019-ncov) lockdown may partially halt the spread of 2019 novel coronavirus in hubei province the lockdown of hubei province causing different transmission dynamics of the
novel coronavirus (2019-ncov) in wuhan and beijing the effect of travel restrictions on the spread of the 2019 novel coronavirus (covid-19) outbreak incubation period of 2019 novel coronavirus (2019-ncov) infections among travellers from wuhan, china assessing spread risk of wuhan novel coronavirus within and beyond china effectiveness of airport screening at detecting travellers infected with novel coronavirus (2019-ncov) interventions targeting air travellers early in the pandemic may delay local outbreaks of sars-cov-2. medrxiv feasibility of controlling 2019-ncov outbreaks by isolation of cases and contacts the impact of traffic isolation in wuhan on the a simple model to assess wuhan lock-down effect and region efforts during covid-19 epidemic in china mainland the lockdown of hubei province causing different transmission dynamics of the novel coronavirus (2019-ncov) in wuhan and beijing an investigation of transmission control measures during the first 50 days of the covid-19 epidemic in china how will country-based mitigation measures influence the course of the covid-19 epidemic the effect of human mobility and control measures on the covid-19 epidemic in china impact of international travel and border control measures on the global spread of the novel 2019 coronavirus outbreak feasibility of controlling covid-19 outbreaks by isolation of cases and contacts the effectiveness of quarantine and isolation determine the trend of the covid-19 epidemics in the final phase of the current outbreak in china modelling transmission and control of the covid-19 pandemic in australia inferring change points in the spread of covid-19 reveals the effectiveness of interventions spread and dynamics of the covid-19 epidemic in italy: effects of emergency containment measures modelling the covid-19 epidemic and implementation of population-wide interventions in italy transmission potential and severity of covid-19 in south korea a mathematical model for simulating the 
transmission of wuhan novel coronavirus a time delay dynamical model for outbreak of 2019-ncov and the parameter identification a data driven time-dependent transmission rate for tracking an epidemic: a case study of 2019-ncov epidemiological identification of a novel infectious disease in real time: analysis of the atypical pneumonia outbreak in wuhan estimating the daily trend in the size of covid-19 infected population in wuhan estimation of the asymptomatic ratio of novel coronavirus infections (covid-19) the extent of transmission of novel coronavirus in wuhan, china, 2020 novel coronavirus 2019-ncov: early estimation of epidemiological parameters and epidemic predictions the novel coronavirus, 2019-ncov, is highly contagious and more infectious than initially estimated nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study simulating the infected population and spread trend of 2019-ncov under different policy by eir model epidemiological and clinical features of the 2019 novel coronavirus outbreak in china predictions of 2019-ncov transmission ending via comprehensive methods preliminary estimation of the basic reproduction number of novel coronavirus (2019-ncov) in china, from 2019 to 2020: a data-driven analysis in the early phase of the outbreak national health commission of the people's republic of china regression shrinkage and selection via the lasso the adaptive lasso and its oracle properties local polynomial regression: optimal kernels and asymptotic minimax efficiency a fast iterative shrinkage-thresholding algorithm for linear inverse problems principles and practice national health and family planning commission of china. report on china's migrant population development the authors would like to thank southwestern university of finance and economics, australian national university and macquarie university for their support.
xiandeng jiang acknowledges the research grants supported by the people's republic of china ministry of education youth project for humanities and social science research (grant no. 20yjc790051). we particularly thank the deputy editor (rafal marszalek) and two anonymous referees for providing valuable and insightful comments on earlier drafts. the usual disclaimer applies. this research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. x.j. collected data and designed the research. l.c. and y.s. performed the research and analyzed the data. x.j., l.c., and y.s. wrote the paper. the authors declare no competing interests. correspondence and requests for materials should be addressed to y.s. reprints and permissions information is available at www.nature.com/reprints. key: cord-283891-m36un1y2 authors: hu, bisong; qiu, jingyu; chen, haiying; tao, vincent; wang, jinfeng; lin, hui title: first, second and potential third generation spreads of the covid-19 epidemic in mainland china: an early exploratory study incorporating location-based service data of mobile devices date: 2020-05-17 journal: int j infect dis doi: 10.1016/j.ijid.2020.05.048 sha: doc_id: 283891 cord_uid: m36un1y2 abstract objectives the outbreak of atypical pneumonia caused by the novel coronavirus (covid-19) has currently become a global concern. the generations of the epidemic spread are not well known, yet these are critical parameters to facilitate an understanding of the epidemic. a seafood wholesale market and wuhan city, china, were recognized as the primary and secondary epidemic sources. human movements nationwide from the two epidemic sources revealed the characteristics of the first-generation and second-generation spreads of the covid-19 epidemic, as well as the potential third-generation spread.
methods we used spatiotemporal data of covid-19 cases in mainland china and two categories of location-based service (lbs) data of mobile devices from the primary and secondary epidemic sources to calculate pearson correlation coefficient, r, and spatial stratified heterogeneity, q, statistics. results two categories of device trajectories had generally significant correlations with, and determinant powers for, the epidemic spread. both r and q statistics decreased with distance from the epidemic sources and their associations changed with time. at the beginning of the epidemic, the mixed first-generation and second-generation spreads appeared in most cities with confirmed cases. they strongly interacted to enhance the epidemic in hubei province and the trend was also significant in the provinces adjacent to hubei. the third-generation spread started in wuhan from january 17 to 20, 2020, and in hubei from january 23 to 24. no obvious third-generation spread was detected outside hubei. conclusions the findings provide important foundations to quantify the effect of human movement on epidemic spread and inform ongoing control strategies. the spatiotemporal association between the epidemic spread and human movements from the primary and secondary epidemic sources indicates a transfer from second to third generations of the infection. urgent control measures include preventing the potential third-generation spread in mainland china, eliminating it in hubei, and reducing the interaction influence of first-generation and second-generation spreads. an outbreak of atypical pneumonia caused by the 2019 novel coronavirus (covid-19) was recognized in mid-january 2020 in wuhan city, china. the novel coronavirus that infects humans was first reported in wuhan, hubei province, china, on december 31, 2019 (zhu et al. 2020). early confirmed cases were mainly linked to a seafood wholesale market in wuhan (li et al. 2020a; zhu et al. 2020).
epidemiological studies indicate that the covid-19 epidemic has a basic reproductive number between 2 and 3 (li et al. 2020a; wu et al. 2020), which is lower than that of the 2003 severe acute respiratory syndrome (sars) (lipsitch 2003; riley et al. 2003). wuhan is a main transportation hub in central china, and several million travelers ventured outward from the epidemic outbreak source in the first half of january, 2020, due to annual chinese (lunar) new year holiday migrations. the large-scale outbreak started on january 19 (the first confirmed case reported outside hubei province). although strict transportation screening measures were activated by many cities in the next 3-4 days, the epidemic rapidly spread nationwide in a week. moreover, covid-19 infections have been identified in other countries and the current epidemic has become a global concern (cohen and normile 2020; holshue et al. 2020; rothe et al. 2020). the world health organization (who) declared the covid-19 outbreak a public health emergency of international concern (pheic) on january 30 (who 2020b). there is evidence that the epidemic outbreak in china and elsewhere spread along the paths of travel from wuhan (li et al. 2020b), and local outbreaks could appear in other major cities of china with time lags (wu et al. 2020). massive human movements via railways and domestic/international airlines from wuhan, together with the timing of chinese new year, have enabled the virus to spread nationwide and worldwide (peeri et al. 2020). control measures (e.g., travel quarantine and restrictions) in wuhan were effective in delaying the overall epidemic progression in mainland china and reducing the international case importations (chinazzi et al. 2020).
the huanan seafood wholesale market and wuhan were recognized as the primary and secondary epidemic centers, respectively, and therefore, the movements of populations from the two sources influenced the generations of the covid-19 epidemic in mainland china, especially during the very early epidemic stage before the transportation measures were activated by wuhan and other cities. the first-generation (primary) spread of the epidemic was in part reflected by the human movement from the primary source (i.e., the seafood market), and the second-generation (secondary) spread was reflected by that from the secondary center (i.e., wuhan city). they varied and interacted by region and time during the early epidemic progression, and offered potential clues to identify the third-generation spreads in various regions, which are mainly caused by the local cases instead of the imported ones. here, using location-based service (lbs) data of mobile devices, we analyzed the spatiotemporal association of the confirmed covid-19 cases and human movements from the sources of the epidemic outbreak, and revealed the first, second and potential third generation spreads of the covid-19 epidemic in mainland china. we collected spatiotemporal data of covid-19 cases in mainland china from the daily bulletins of the national health commission of the people's republic of china (nhc) and various provincial/municipal health commissions. some publicly available news and media were utilized as supplemental data. the final epidemic dataset was comparatively verified through the public platform of the 2019-ncov-infected pneumonia epidemic from the chinese center for disease control and prevention (china cdc 2020a). the dataset of the covid-19 cases includes the following fields: date (starting from january 10, 2020), province code/name, city code/name, and numbers of daily new suspected/confirmed cases.
from the above dataset, we can generate the cumulative number of daily confirmed cases at a specific city s and until a given end date t, which is denoted by $y_{s,t}$. the human movements of populations from two epidemic sources (the huanan seafood wholesale market and wuhan) were considered to be associated with the spatiotemporal epidemic spread. the datasets of lbs requests from mobile devices were provided by wayz inc., shanghai, china. the device trace datasets cover over 80% of the mobile devices supported by the three telecommunication operators in china. the lbs-requesting statistics are implemented every two hours with high-resolution location information. the raw data indicate the individual trajectories of numerous mobile devices with high-resolution spatiotemporal information, and can be easily aggregated in a specific spatial scale and within a given time step. for a subpopulation from the epidemic center, we can aggregate the device trace data from the start date to a given end date t, and the corresponding cumulative number at a specific city s is denoted by $x_{s,t}$. multiple lbs requests within a time step are counted only once for the same device. private individual information was deleted from the raw data of the mobile devices, and in this study, the device trace data was aggregated to the administrative cities and the epidemic date, i.e., the mobile device traces were associated with the epidemic dataset according to date and location. these aggregated statistics of mobile device traces are expected to be representative of the human migrations from the epidemic sources. two epidemic sources were considered, including the seafood wholesale market and wuhan city. the devices which activated their lbs requests in the market in november 2019 indicated the potential first-generation cases of the covid-19 epidemic.
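as a minimal sketch of the aggregation described above, the python snippet below counts device traces per city up to an end date t. the record layout (device id, city, day), the function name and the choice of one day as the time step are assumptions made for illustration; the provider's actual pipeline is not described in the text.

```python
from collections import defaultdict
from datetime import date


def cumulative_device_counts(pings, end_date):
    """Return {city: cumulative trace count up to end_date}.

    pings: iterable of (device_id, city, day) tuples (hypothetical layout).
    Repeated requests by the same device in the same city on the same day
    (the assumed time step) are counted only once.
    """
    seen = set()
    counts = defaultdict(int)
    for device_id, city, day in pings:
        if day > end_date:
            continue  # outside the aggregation window
        key = (device_id, city, day)
        if key in seen:
            continue  # duplicate request within the same time step
        seen.add(key)
        counts[city] += 1
    return dict(counts)


# toy records, invented for illustration only
pings = [
    ("a", "wuhan", date(2020, 1, 10)),
    ("a", "wuhan", date(2020, 1, 10)),   # duplicate, ignored
    ("a", "wuhan", date(2020, 1, 11)),
    ("b", "nanchang", date(2020, 1, 12)),
    ("b", "nanchang", date(2020, 1, 20)),  # after the end date, ignored
]
x_t = cumulative_device_counts(pings, date(2020, 1, 12))
```

the same function applied to the case bulletins (with one record per confirmed case) would give the cumulative case counts, under the same assumptions.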
the potential second-generation cases were those which were activated in wuhan in december 2019 and then traveled to other regions in january 2020. $x_{s,t}^{(m)}$ and $x_{s,t}^{(w)}$ are used to denote the spatiotemporal trajectories of the above two subpopulations of mobile devices (market and wuhan), respectively. all the processing and aggregation of mobile device trace data were implemented by the provider. the final datasets include the daily counts of two categories of trajectories in all the administrative cities in mainland china. the cumulatively summed device traces had a spatially distributed consistency with the population distribution in mainland china (figure 1). two categories of trajectories mainly spread to the provinces adjacent to hubei and several developed areas a longer distance from hubei, such as guangdong province, zhejiang province and beijing. we considered the spread of the epidemic from the source in various space and time domains, and the corresponding associations with human movements were analyzed in several temporal divisions and spatial scales. seven areas were delineated, including i) wuhan city, ii) hubei province excluding wuhan, iii) hubei province, iv) hubei's adjacent provinces (anhui, chongqing city, henan, hunan, jiangxi and shaanxi), v) mainland china excluding hubei, vi) mainland china excluding wuhan, and vii) mainland china. date periods were generated using three key date stamps, including january 10, 2020 (when the first 41 confirmed cases were reported in wuhan), january 19 (when the large-scale outbreak started) and january 26 (the end of the first week of the large-scale outbreak). based on the above datasets of covid-19 cases in mainland china and two categories of location-based service data of mobile devices from the epidemic sources, we calculated their pearson correlation coefficient, r, and spatial stratified heterogeneity (ssh), q, statistics.
pearson correlation is usually used to evaluate the linear association between two variables and is calculated as follows:

$$ r_{xy} = \frac{\sum_{s=1}^{n} (x_{s,t} - \bar{x})(y_{s,t} - \bar{y})}{\sqrt{\sum_{s=1}^{n} (x_{s,t} - \bar{x})^{2}} \, \sqrt{\sum_{s=1}^{n} (y_{s,t} - \bar{y})^{2}}} \qquad (1) $$

where $r_{xy}$ denotes the correlation coefficient of covid-19 spatiotemporal spread and human migrations from the epidemic source, within the period from the start date to a given end date t. $y_{s,t}$ is the cumulative number of daily confirmed cases at city s and $x_{s,t}$ is the cumulative number of device trajectories from the epidemic source, with the mean values of $\bar{y}$ and $\bar{x}$, respectively. n is the number of the administrative cities in mainland china. in this study, we calculated two pearson correlations with the spatiotemporal data of two categories of trajectories, $x_{s,t}^{(m)}$ and $x_{s,t}^{(w)}$, to explore the associations between the epidemic spread and the human migrations from the seafood market and wuhan, respectively. the geodetector q statistic is generally applied to quantitatively evaluate the ssh of an explained variable (wang et al. 2010, 2016), and assess the determinant power of explanatory variables and their interaction, without linear assumptions (yin et al. 2019). the fundamental formula of the q statistic is given by:

$$ q = 1 - \frac{1}{N \sigma^{2}} \sum_{h=1}^{L} N_{h} \sigma_{h}^{2} \qquad (2) $$

where q is the determinant power of the factor to the objective. $N$ is the number of objective variable observations and $\sigma^{2}$ indicates the variance of all the observations. the objective is stratified into $L$ strata, denoted by $h = 1, 2, \ldots, L$, which are determined by the determinant factor. $N_{h}$ is the number of observations and $\sigma_{h}^{2}$ is the corresponding variance within stratum h. the value of q ranges from 0 to 1. we calculate the q statistic to assess the determinant power of human migrations from the epidemic source to covid-19 spatiotemporal spread. similarly, the spatiotemporal data of two categories of trajectories can be applied to calculate two q statistics for the two epidemic sources.
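the two statistics can be illustrated with a short python sketch. this is not the authors' r code; the function names are hypothetical, and the q statistic is computed here with population variances (an assumption), so that each term n_h times the stratum variance equals the within-stratum sum of squares.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def q_statistic(y, strata):
    """q = 1 - (sum over strata of N_h * var_h) / (N * var).

    y: observed values (e.g. cumulative cases per city).
    strata: stratum label of each observation, same length as y.
    Population variances are assumed, so N * var is a sum of squares.
    """
    n = len(y)
    mean = sum(y) / n
    total = sum((v - mean) ** 2 for v in y)  # N * sigma^2
    if total == 0:
        raise ValueError("q is undefined when all observations are equal")
    groups = {}
    for value, label in zip(y, strata):
        groups.setdefault(label, []).append(value)
    within = 0.0
    for values in groups.values():
        m = sum(values) / len(values)
        within += sum((v - m) ** 2 for v in values)  # N_h * sigma_h^2
    return 1.0 - within / total


# toy values, invented for illustration only
r_demo = pearson_r([1, 2, 3], [2, 4, 6])
q_demo = q_statistic([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1])
```

in the toy example the two strata separate low and high values almost completely, so q is close to 1; a single stratum containing all observations would give q equal to 0.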
within the period from the start date to a given end date t, we implemented the stratification by the equal-interval division after ordering the trajectory data, $x_{s,t}$, and divided all the observations into 5 strata to calculate the q statistic of the cumulative trajectories, $x_{s,t}$, to the cumulative cases, $y_{s,t}$. this is a common way to stratify numerical independent variables (yin et al. 2019), which can reduce the subjective influence of various stratifications on q statistics. moreover, for two or more determinant factors, an interaction q statistic can be calculated to measure their interaction influences (e.g., are they independent, or do they weaken/enhance each other?) (wang et al. 2010). in this study, two categories of trajectories, $x_{s,t}^{(m)}$ and $x_{s,t}^{(w)}$, were used to implement the stratifications and the corresponding q statistics were calculated, respectively, which are denoted by $q^{(m)}$ and $q^{(w)}$. when the stratification is generated by the intersection between the above two individual stratifications, an interaction q statistic, $q^{(m \cap w)}$, can be calculated, where the symbol "∩" denotes the intersection between two strata layers. various interaction types can be defined according to the comparison between $q^{(m)}$, $q^{(w)}$ and $q^{(m \cap w)}$ (wang et al. 2010). for instance, $q^{(m \cap w)}$ being greater than both $q^{(m)}$ and $q^{(w)}$ indicates a bi-enhancement interaction between two categories of trajectories in facilitating the spread of the epidemic (see wang et al. 2010 for more details about the interaction q statistic). analyses in this study were performed with the use of the r software package (r foundation for statistical computing) and thematic mapping was implemented in the arcgis platform (esri).
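the stratification step can be sketched in python as follows. the text does not say whether "equal-interval" refers to equal-width bins of the value range or to equal-sized groups of the ordered data; the sketch assumes equal-width bins, and both function names are hypothetical.

```python
def equal_interval_strata(x, k=5):
    """Assign each value to one of k equal-width bins over [min(x), max(x)].

    Returns a list of stratum indices in 0..k-1 (assumed binning scheme).
    """
    lo, hi = min(x), max(x)
    if hi == lo:
        return [0] * len(x)  # all values identical: a single stratum
    width = (hi - lo) / k
    # the maximum value would fall into bin k, so clamp it to the last bin
    return [min(int((v - lo) / width), k - 1) for v in x]


def intersect_strata(a, b):
    """Combine two stratifications: each distinct pair is one joint stratum."""
    return list(zip(a, b))


# toy values, invented for illustration only
market_strata = equal_interval_strata([0, 1, 2.5, 5, 9.9, 10])
joint = intersect_strata([0, 0, 1], [1, 1, 0])
```

the joint labels returned by `intersect_strata` can be passed as the stratum labels of a q statistic routine to obtain an interaction q statistic in the sense described above.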
similar to the spatial distributions of the mobile device traces (figure 1 ), the pearson correlations r and q statistics between the cumulatively summed cases and two categories of trajectories up to january 26, 2020 had a spatially distributed consistency with the population distribution among the administrative cities in mainland china ( figure 2 ). two categories of trajectories had generally significant correlations and determinant powers of the epidemic spread, and both r and q decreased in distance from the epidemic sources. the first-generation and second-generation transmissions of the infection simultaneously appeared in many cities at the early stage of the outbreak. specifically, devices activated in the market displayed higher values of r and q in several small and medium cities than devices activated in wuhan city (figures 2a and 2c) . it is clear that many cities executed a quick response and activated transportation control measures, which helped control the first-generation epidemic spreads. the r and q statistics of the devices activated in wuhan, however, indicate that the second-generation spread still influenced many cities in the first week of the outbreak ( figures 2b, 2d and table 1 ). the market trajectories received a much higher pearson correlation value to confirmed cases in wuhan (r=0.6160, p<0.001) than hubei province excluding wuhan (r=0.3741, p<0.001) and mainland china excluding hubei (r=0.3319, p<0.001) . the correlations of wuhan trajectories were 0.7438, 0.5874 and 0.5183 in the above three areas, respectively. the temporal correlation curves of both market and wuhan trajectories have obvious decreasing trends from january 17 to 20, 2020 in wuhan ( figure 3a) , which indicates the potential start date of the third-generation epidemic spread. one week after this, market trajectories had higher pearson correlation values than wuhan trajectories, and the first-generation spread still had a serious influence in wuhan ( figure 3a) . 
similarly, in hubei province excluding wuhan, the potential start date of the third-generation spread was from january 23 to 24 ( figure 3b) . moreover, the second-generation spread played a dominant role in the areas outside wuhan, especially in hubei province excluding wuhan and the provinces adjacent to hubei, since wuhan trajectories had much higher values of correlations ( figures 3b and 3c ). we found no obvious turning dates in the areas outside hubei ( figures 3c and 3d) , and the potential third-generation spread remains to be determined. the curves have remained stationary since january 22 in mainland china excluding hubei ( figure 3d ). the transportation control measures activated by many cities since january 21 appeared to have been successful in partially controlling the first-generation and second-generation epidemic spreads outside hubei province. we focused on the first week of the large-scale outbreak and calculated the q statistics of the two device-activation categories in introducing cumulative confirmed cases in various areas (table 1) . the determinant powers of both categories were extremely high and consistent in wuhan (q=0.8909, p<0.05). their temporal curves had the obvious decreasing trends from january 17 to 20 ( figure 4a ), which validated the start date of the third-generation spread in wuhan. similar validation was observed in hubei province excluding wuhan ( figure 4b ). two categories of trajectories can explain nearly 100% ssh of the epidemic spread in wuhan before the large-scale outbreak and the ssh increased constantly since the third-generation spread stage ( figure 4a ). the market and wuhan trajectories had close determinant powers in introducing the epidemic spread in hubei province (q=0.4153, q=0.4261, respectively, and p<0.001). the q statistics reported that these two categories explained 41.53% and 42.61% ssh of the confirmed cases in hubei. 
the determinant powers of the epidemic spread in hubei province excluding wuhan were 0.2084 (p<0.01) and 0.2513 (p<0.001), respectively. the q statistic values decreased with distance outside wuhan or hubei and showed that the determinant powers in mainland china excluding hubei were 0.1610 (p<0.001) and 0.1723 (p<0.001), respectively. in the first week of the outbreak, wuhan trajectories received higher values of q statistics than market trajectories in hubei province excluding wuhan and in provinces bordering hubei (figures 4b and 4c). the second-generation spread contributed more influence in the areas surrounding the epidemic source. however, the two categories had close q statistic values in mainland china excluding hubei (figure 4d). the epidemic outside hubei province appeared as a balanced pattern of mixed first-generation and second-generation spreads. furthermore, the q statistics increased constantly outside hubei province, indicating the increasing ssh of the epidemic spread (figures 4c and 4d). more attention should be given to controlling the trend of second-generation spread and to eliminating potential third-generation spread. taking into consideration the interaction influences of two categories of trajectories, the interaction q statistics were calculated in various areas (table 1). all the interaction types were bi-enhancement, which indicates that two determinant factors (i.e., two categories of trajectories originated from two epidemic sources) enhance each other (the interaction q statistic is higher than each single q statistic but lower than the sum of two single q statistics). the determinant powers and interactions of two categories of trajectories in introducing the epidemic spread decreased with distance from the source to the rest of the nation. the interaction q statistic was 0.1925 (compared to the single q statistics of 0.1610 and 0.1723) in mainland china excluding hubei.
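the bi-enhancement rule stated above (interaction q higher than each single q but lower than their sum) can be checked with a small python sketch. the function name is hypothetical, and the other category names follow the convention commonly used for the geodetector interaction detector as recalled here, so they should be verified against wang et al. 2010 before being relied on.

```python
def interaction_type(q1, q2, q12, tol=1e-9):
    """Classify the interaction of two factors from their q statistics.

    q1, q2: single-factor q statistics; q12: interaction q statistic.
    Category names other than "bi-enhance" are assumptions (see lead-in).
    """
    lo, hi = min(q1, q2), max(q1, q2)
    if abs(q12 - (q1 + q2)) <= tol:
        return "independent"
    if q12 > q1 + q2:
        return "nonlinear-enhance"
    if q12 > hi:
        return "bi-enhance"  # above each single q, below their sum
    if q12 > lo:
        return "uni-weaken"
    return "nonlinear-weaken"


# values reported in the text for mainland china excluding hubei
outside_hubei = interaction_type(0.1610, 0.1723, 0.1925)
```

with the reported values, 0.1925 lies above 0.1723 and below 0.1610 + 0.1723 = 0.3333, which matches the bi-enhancement type described in the text.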
the interaction q statistic was 0.0786 (compared to the single q statistics of 0.0657 and 0.0642) in mainland china. although the interaction strength was weak, the combination of both trajectory categories still carried more information about the spread of the epidemic throughout the country. the interaction q statistic of two categories of trajectories in hubei province excluding wuhan was 0.4063, which was close to the sum of two single q statistics (0.2084 and 0.2513) and much higher than each one individually. this interaction indicates strong bi-enhancement in facilitating the spread of the epidemic. two categories of trajectories could significantly enhance each other to explain the ssh of the epidemic spread from wuhan to other areas in hubei province. the majority of the earliest cases of the covid-19 atypical pneumonia were linked to the seafood wholesale market in wuhan, which is the most severely-affected city of the covid-19 outbreak. the movements of populations from these two epidemic sources provided potential first-generation and second-generation spreads nationwide and worldwide. here, based on lbs-requesting mobile device traces and spatiotemporal confirmed covid-19 case data, we applied pearson correlation and geodetector q statistics to analyze the spatiotemporal association between the confirmed cases' dynamic and human movements. our findings provide important foundations to quantify the effect of human movement on the epidemic spread, to judge the epidemic generations, and to inform ongoing and future control strategies. we concentrated on two datasets of lbs-requesting mobile devices associated with two sources linked to the first-generation and second-generation spreads provincewide and nationwide. their traces were aggregated by date in administrative cities and linked to the spatiotemporal confirmed cases. it is notable that the covid-19 outbreak had a strong consistency with human migrations from the epidemic sources. 
the confirmed cases had a clear linear correlation with two categories of trajectories from the sources to the rest of the nation. moreover, both trajectory categories could generally indicate the epidemic spread in hubei province and explain to a certain extent the ssh of the spread from wuhan to the rest of hubei province and throughout the rest of china. our analyses provide a new perspective to explore the spread of the epidemics linked to human movement. during the first week of the large-scale outbreak, the epidemic spread showed a spatially distributed consistency with the population distribution in mainland china. the majority of cities with confirmed cases had a mixed pattern of first-generation and second-generation spreads at the very beginning of the outbreak. many cities activated a quick response within 3-4 days and achieved efficient results in inhibiting the first-generation spread outside hubei province. however, the first-generation spread still had a significant impact in hubei province, especially playing the dominant role inside wuhan city. furthermore, among the other cities in hubei province, the first-generation and second-generation spreads enhanced each other with a much higher interaction q statistic. this might be another signal to identify the potential start date of the third-generation spread in a specific area. due to the quick response and strict control measures in many cities, the interaction enhancement of the first-generation and second-generation spreads was weak outside hubei province. there is no evidence that any third-generation spread appeared outside hubei in mainland china in the first week of the outbreak. nevertheless, hubei's adjacent provinces require more effective control measures, since the first-generation and second-generation spreads had an increasing trend. our analyses determined an appropriate approach to explore the spatiotemporal association between the epidemic transmission and human movement.
two categories of lbs-requesting mobile devices were used in this study to identify the potential close contacts to the primary and secondary epidemic sources. the datasets covered most devices with lbs requests in the given region and time period. however, the linkage between mobile devices and populations could be subject to information loss (e.g., users may replace their mobile devices with new ones). it is also extremely difficult to cover 100% potential close contacts in our datasets. the close contacts of these two populations while traveling before/after the outbreak were not collected, and therefore we cannot estimate the potential third-generation cases and their movements. this limitation involves future work with more universal-source data and high-performance computing capabilities. the covid-19 epidemic data were collected through publicly available sources, and we processed the data of confirmed cases and device traces in the spatial scale of cities. small-scale analyses could be more helpful to construct epidemic control programs in counties or communities within a city. the spatiotemporal association between the spread of the epidemic and human movements indicates a transfer from second to third generations of the infection. this approach has made it possible to assess the start date of the third-generation spreads of covid-19 epidemic and the interactions between first-generation and second-generation spreads across various regions all over the country. the proposed technique incorporating location-based service data of mobile devices can help identify the spatiotemporal generations at the early stage of the covid-19 epidemic. it can be easily implemented and extended to the early exploratory study of other epidemics similar to covid-19. the results indicate the spatiotemporal characteristics of the epidemic spread associated to human movements from epidemic sources and the potential spatiotemporal risks at the early stage of the outbreak. 
control measures varying by location and time could be executed in different levels for various regions. for instance, cities with obvious third-generation spread require the strictest controls on both the exportations and the inside quarantine; cities should pay more attention to the importations and the inside quarantine if the first-generation and second-generation spreads have strong interactive enhancements; and other cities need to focus on the control of the importations. in conclusion, we found that the third-generation spread of the covid-19 outbreak probably started during january 17 to 20, 2020 in wuhan, the potential start date of the third-generation spread in hubei province excluding wuhan was from january 23 to 24, and the mixed first-generation and second-generation spreads strongly interacted to enhance the epidemic. the trend of the interactions between the first-generation and second-generation spreads was significant in the provinces adjacent to hubei. the associations between the epidemic spread decreased with distance and had different temporal patterns from the epidemic sources, implying the potential epidemic generation-to-generation evolution on regional spatial scales. at the very beginning of the outbreak, the mixed first-generation and second-generation spreads appeared in most cities with confirmed cases. no obvious third-generation spread was detected outside hubei province. the strict transportation measures implemented in many cities appeared to have been effective in preventing any third-generation spread nationwide. the urgent control measures in hubei province include weakening the third-generation spread and the interaction influence of the first-generation and second-generation spreads. even with strict control strategies, effective measures to reduce transmission in the community are still required (li et al. 2020a).
a large increase in migration due to people returning from travel after the new year holiday also introduces challenges to epidemic control. we recommend the urgent control measures of preventing potential third-generation spread in mainland china, eliminating it in hubei, and reducing the interaction influence of first-generation and second-generation spreads. no individual data were collected, so ethical approval and individual consent were not applicable. the lbs-requesting mobile device data were provided by wayz inc., shanghai, china and are not available for distribution due to the constraint in the consent. the dataset of the covid-19 cases is available from multiple public sources. this work was supported by the national natural science foundation of china (41531179), the national science and technology major project of china (2016yfc1302504) and the science and technology major project of jiangxi province, china (2020ybbgw0007). the funders had no role in study design and conduct; data collection, management, analysis and interpretation; manuscript preparation, writing and review; or the decision to submit the manuscript for publication. we declare no competing interests. public platform of the 2019-ncov-infected pneumonia epidemic the effect of travel restrictions on the spread of the 2019 novel coronavirus (covid-19) outbreak new sars-like virus in china triggers alarm first case of 2019 novel coronavirus in the united states early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia potential of large 'first generation' human-to-human transmission of 2019-ncov transmission dynamics and control of severe acute respiratory syndrome the sars, mers and novel coronavirus (covid-19) epidemics, the newest and biggest global health threats: what lessons have we learned?
transmission dynamics of the etiological agent of sars in hong kong: impact of public health interventions transmission of 2019-ncov infection from an asymptomatic contact in germany a novel coronavirus outbreak of global health concern. the lancet what to do next to control the 2019-ncov epidemic? the lancet geographical detectors-based health risk assessment and its application in the neural tube defects study of the heshun region, china a measure of spatial stratified heterogeneity emergency committee regarding the outbreak of novel coronavirus (2019-ncov) nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study. the lancet mapping the increased minimum mortality temperatures in the context of global climate change a novel coronavirus from patients with pneumonia in china we thank dr. adam thomas devlin at the school of geography and environment, jiangxi normal university for the assistance in the proofreading work for the manuscript. key: cord-009688-kjx6cvzh authors: zhao, ze-yu; chen, qi; zhao, bin; hannah, mikah ngwanguong; wang, ning; wang, yu-xin; xuan, xian-fa; rui, jia; chu, mei-jie; yu, shan-shan; wang, yao; liu, xing-chun; an, ran; pan, li-li; chiang, yi-chen; su, yan-hua; zhao, ben-hua; chen, tian-mu title: relative transmissibility of shigellosis among male and female individuals: a modeling study in hubei province, china date: 2020-04-17 journal: infect dis poverty doi: 10.1186/s40249-020-00654-x sha: doc_id: 9688 cord_uid: kjx6cvzh background: developing countries exhibit a high disease burden from shigellosis. owing to the different incidences in males and females, this study aims to analyze the features involved in the transmission of shigellosis among male (subscript m) and female (subscript f) individuals using a newly developed sex-based model.
methods: the data of reported shigellosis cases were collected from the china information system for disease control and prevention in hubei province from 2005 to 2017. a sex-based susceptible–exposed–infectious/asymptomatic–recovered (seiar) model was applied to explore the dataset, and a sex-age-based seiar model was applied in 2010 to explore the sex- and age-specific transmissions. results: from 2005 to 2017, 130 770 shigellosis cases (including 73 981 male and 56 789 female cases) were reported in hubei province. the seiar model exhibited a significant fitting effect with the shigellosis data (p < 0.001). the median values of the shigellosis transmission were 2.3225 × 10(-8) for sar(mm) (secondary attack rate from male to male), 2.5729 × 10(-8) for sar(mf), 2.7630 × 10(-8) for sar(fm), and 2.1061 × 10(-8) for sar(ff). the top five mean values of the transmission relative rate in 2010 (where the subscript 1 was defined as male and age ≤ 5 years, 2 was male and age 6 to 59 years, 3 was male and age ≥ 60 years, 4 was female and age ≤ 5 years, 5 was female and age 6 to 59 years, and 6 was female and age ≥ 60 years) were 5.76 × 10(-8) for β(61), 5.32 × 10(-8) for β(31), 4.01 × 10(-8) for β(34), 7.52 × 10(-9) for β(62), and 6.04 × 10(-9) for β(64). conclusions: the transmissibility of shigellosis differed among male and female individuals. the transmissibility between the genders was higher than that within the genders, particularly female-to-male transmission. the most important route in children (age ≤ 5 years) was transmission from the elderly (age ≥ 60 years). therefore, the greatest interventions should be applied in females and the elderly.
old [1]. according to the chinese center for disease control and prevention (china cdc), approximately 150 000 to 450 000 cases were reported annually within the period 2005 to 2014 [2].
although there has been an improvement in the quality of water and sanitation, shigellosis remains a major public health problem in several developing countries, including china [3, 4]. bacillary dysentery is an infectious intestinal disease that can be transmitted via the consumption of contaminated food or water [5]. humans are the only natural host for shigella spp. in recent years, numerous reports have demonstrated that the incidence of shigellosis within males is higher than that within females [6] [7] [8]. the incidence of shigellosis, a water/food-borne disease, is directly related to hygiene behaviours such as regular hand washing [9]. a study has indicated that the sanitary state in females is always higher than that in males [10]. does this mean that the transmission features differ between males and females? a study has reported that shigellosis primarily occurs from person to person [1]. thus, the water/food-to-person route has been interrupted. moreover, many studies have indicated different incidences in individuals of various ages [1, 8, 11]. in this study, we aimed to explore the interpersonal transmission further. in model studies of shigellosis, the distribution of time and space has been a greater focus than population-based research [12] [13] [14] [15] [16]. a study demonstrated that the susceptible-exposed-infectious/asymptomatic-recovered-water/food (seiarw) model exhibited a significant fitting effect with outbreak data in a school [17]. however, it did not estimate the transmissibility of bacillary dysentery between males and females. considering that water makes less of a contribution in the transmission, a sex-based susceptible-exposed-infectious/asymptomatic-recovered (seiar) model was applied to explore the dataset from hubei province.
the secondary attack rate (sar), which is defined as the probability of an infected person infecting a susceptible person during his or her entire infectious period, was adopted to assess the relative transmissibility of shigellosis between males and females. in this study, shigellosis cases reported in hubei province, china, were collected. the seiar model was applied to fit the data, calculate the related index, and determine the transmissibility of shigellosis between males and females. with the aim of exploring the transmission features in different gender and age groups, the seiar model was adopted to fit the data of shigellosis cases reported from 2005 to 2017 in hubei province, china. a mathematical study was implemented using a sex- and age-based model to analyze the transmission characteristics of reported shigellosis cases in hubei province, china, from 2005 to 2017. in this study, we divided the research process into three parts (fig. 1). first, we developed the model according to the natural history and transmission mechanism in different genders. second, we acquired the model parameters by reference and curve fitting. finally, we adopted indicators to estimate the transmissibility in different genders and to explore the transmission features in different age groups further. the dataset of the shigellosis cases was collected from the china information system for disease control and prevention in hubei province from 2005 to 2017. the dataset included gender, age, occupation, address, date of onset, and date of diagnosis. in this study, people were divided into two groups according to gender. the information of the population, such as the birth rate, death rate and total population, was obtained from the hubei statistical yearbook. the seiar model was developed according to the natural history of shigellosis among male and female individuals (fig. 2). we used the subscripts m to represent male and f to represent female.
the pattern followed by the model was person to person, which consisted of susceptible (s m , s f ), exposed (e m , e f ), symptomatic (i m , i f ), asymptomatic (a m , a f ) and recovered (r m , r f ) individuals. definitions of the epidemiological classes are summarized in table 1 . in the model, we assumed that: a) susceptible individuals of different genders become infected by contact with infected/asymptomatic people. b) the relative rate of transmission among male and female individuals is β mm and β ff , respectively. c) the relative rate of transmission from male to female is β mf and from female to male is β fm . moreover, we assumed that in both male and female: a) the disease does not spread vertically, and individuals born in various groups are all susceptible. the natural birth rate is br and the natural mortality rate is dr. b) according to a new review [1] , the transmission of shigellosis mainly occurs from person-to-person. meanwhile, our pilot study indicated a minor contribution of water/food (additional file 1). therefore, we assumed that the water/food to person transmission route had been cut off. c) the (1-p) e (0 ≤ p ≤ 1) number of exposed individuals will change to infected person i following an incubation period, while a further pe number of exposed individuals will become asymptomatic person a following a latent period (the period during which the exposed individuals become an asymptomatic person). d) the removal speed from i and a is positively proportional to the number of people in both groups, and the proportional coefficients are γ and γ', respectively, whereas 1/γ and 1/γ' are the infectious period of i and a. e) the infected person will die as a result of the disease and the case fatality rate is f. the model is expressed as follows: the left side of the equation indicates the instantaneous rate of change of s, e, i, a and r at time t. 
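the equations themselves did not survive in this copy of the text. the block below is a sketch of one system that is consistent with assumptions a) to e) above, not a verbatim recovery of the published equations. it is written for males; the female equations follow by swapping the subscripts m and f. two points are assumptions on our part: k is taken to be the relative transmissibility of asymptomatic compared with symptomatic individuals, and a single rate ω is used for both the incubation and the latent period. n m denotes the total male population.

```latex
\begin{aligned}
\frac{dS_m}{dt} &= br\,N_m - \beta_{mm} S_m (I_m + kA_m) - \beta_{fm} S_m (I_f + kA_f) - dr\,S_m \\
\frac{dE_m}{dt} &= \beta_{mm} S_m (I_m + kA_m) + \beta_{fm} S_m (I_f + kA_f) - \omega E_m - dr\,E_m \\
\frac{dI_m}{dt} &= (1-p)\,\omega E_m - \gamma I_m - (dr + f)\,I_m \\
\frac{dA_m}{dt} &= p\,\omega E_m - \gamma' A_m - dr\,A_m \\
\frac{dR_m}{dt} &= \gamma I_m + \gamma' A_m - dr\,R_m
\end{aligned}
```

in this form β fm multiplies the female infectious compartments in the male equations, matching the definition of β fm as the relative rate of transmission from female to male.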
in the model, the sar was calculated as follows: considering that the transmissibility could relate to different ages (we considered three age groups based on the age distribution of the reported shigellosis incidences in the province), we divided individuals into six groups. the subscript 1 was defined as male and age ≤ 5 years, 2 was male and age 6 to 59 years, 3 was male and age ≥ 60 years, 4 was female and age ≤ 5 years, 5 was female and age 6 to 59 years, and 6 was female and age ≥ 60 years. thereafter, we constructed a sex-age-based seiar model. we calculated the ratios x, y, and z (from the results of the sex-based seiar model) in four transmission routes of the different genders to increase the reliability of the estimated parameters. we set β ff as β 0 ; the framework is presented in fig. 3 and its equation is provided in additional file 2. according to the reported incidence of shigellosis from 2005 to 2017 in hubei province, we selected the year 2010 to quantify the transmissibility in the different sex and age groups (fig. 4a). meanwhile, we compared wuhan city with yichang city based on the different incidence in both cities of hubei province in 2010 (fig. 4b). according to the epidemiological characteristics of shigellosis and our previous study [17], we set k and γ' as 0.3125 and 0.0286, respectively. the proportions of asymptomatic individuals were reported to range from 0.0037 to 0.2700 [18] [19] [20]. we set p = 0.1 in the seiar model. the incubation period of shigellosis was reported to range from 1 to 3 days [21] [22] [23]. therefore, we set ω as 0.3333 to 1.000. the symptoms generally last for 1 week, but certain people may experience symptoms for several weeks [24, 25]. we assumed the course of the disease was up to 3 weeks. therefore, we set γ as 0.0477 to 0.1428. the fatality rate of the disease reported in a study decreased from 0.00088 to 0.00031 from 1991 to 2000 [26].
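the parameter ranges above follow from taking a rate as the reciprocal of a duration in days (1 to 3 days of incubation gives ω between 1/3 and 1; 1 to 3 weeks of illness gives γ between roughly 1/21 and 1/7). the sketch below shows that conversion together with one common form of the sar for seiar-type models. the sar expression is an assumption on our part, since the formula is missing from this copy of the text: it multiplies β by the expected infectious time of a case, weighting the asymptomatic branch by k.

```python
def rate_from_duration(days):
    """Convert a mean duration in days to a per-day rate (1 / duration)."""
    if days <= 0:
        raise ValueError("duration must be positive")
    return 1.0 / days


def secondary_attack_rate(beta, p, k, gamma, gamma_prime):
    """Assumed form: SAR = beta * ((1 - p) / gamma + p * k / gamma_prime).

    (1 - p) / gamma     expected infectious time of the symptomatic branch
    p * k / gamma_prime asymptomatic branch, scaled by relative infectivity k
    """
    return beta * ((1.0 - p) / gamma + p * k / gamma_prime)


# Parameter values quoted in the text; beta is the reported mean of beta_mm.
omega_range = (rate_from_duration(3), rate_from_duration(1))
gamma_range = (rate_from_duration(21), rate_from_duration(7))
sar_mm = secondary_attack_rate(
    beta=1.9240e-9, p=0.1, k=0.3125, gamma=0.0741, gamma_prime=0.0286
)
```

with these inputs the assumed formula gives a value of the order of 10(-8), the same order of magnitude as the sar values reported in the abstract, which is some reassurance that the form is reasonable but not a confirmation of it.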
considering that the fatality rate of shigellosis is extremely low [27], we set f = 0. the values of β mm , β ff , β mf and β fm were generated by curve fitting using the seiar model and the reported shigellosis data. the definitions, ranges and sources of the parameters are displayed in table 2. we performed a "knock-out" simulation to explore the roles of the different β values. the theory of the "knock-out" simulation originates from the gene "knock-out" technique (an experimental technique used in genetics in which a normal gene is replaced by a defective gene either at the exact same chromosomal site, so that the normal gene is 'knocked out' by the defective gene, as occurs with the yeast genome, or the deoxyribonucleic acid is inserted at random sites, as occurs in [28]). in the model, we always estimated the contribution of one parameter by setting it to 0 and calculating the resulting decrease in the number of cases or total attack rate. for example, the contribution of the parameter β fm simulated by the model was the decrease in the number of cases when we set it to 0. therefore, "knock-out" simulation (interrupting the different shigellosis transmission routes among males and females) was performed in five scenarios in our study: a) β mm = 0; b) β mf = 0; c) β ff = 0; d) β fm = 0; and e) control (no intervention). was employed for the model simulation. the simulation methods were as previously described [17, [29] [30] [31] [32]. according to our previously published studies [33, 34], we assumed that heterogeneity of the transmissibility existed during an ascending trend and a descending trend. the annual data were therefore divided into numerous parts and the simulated time step was a day; for example, the data of 2010 were divided into 13 parts. moreover, spss 21.0 (ibm corp, armonk, ny, usa) was used to calculate the coefficient of determination (r 2 ) by curve fitting, which was adopted to judge the model goodness of fit.
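the five "knock-out" scenarios above can be sketched as a small simulation. this is a simplified illustration under stated assumptions, not the authors' code: births and deaths are ignored, f = 0, a one-day euler step is used, ω = 0.5 and γ = 0.0741 are single values picked from the quoted ranges, and the initial population sizes and seed cases are hypothetical. the β values are the reported 2005 to 2017 means.

```python
# Reported mean transmission rates; key "fm" means female-to-male.
BETA = {"mm": 1.9240e-9, "mf": 1.8750e-9, "fm": 2.1572e-9, "ff": 1.5645e-9}


def simulate(beta, days=365, p=0.1, k=0.3125, omega=0.5,
             gamma=0.0741, gamma_prime=0.0286):
    """Euler-step a two-sex SEIAR model; return cumulative symptomatic cases."""
    # Hypothetical initial state, not taken from the study.
    S = {"m": 3.0e7, "f": 2.9e7}
    E = {"m": 0.0, "f": 0.0}
    I = {"m": 10.0, "f": 10.0}
    A = {"m": 0.0, "f": 0.0}
    cases = {"m": 0.0, "f": 0.0}
    for _ in range(days):
        # Force of infection on each sex from both source sexes,
        # computed from the state at the start of the step.
        force = {
            to: sum(beta[src + to] * (I[src] + k * A[src]) for src in "mf")
            for to in "mf"
        }
        for g in "mf":
            new_exposed = force[g] * S[g]
            new_symptomatic = (1 - p) * omega * E[g]
            new_asymptomatic = p * omega * E[g]
            S[g] -= new_exposed
            E[g] += new_exposed - new_symptomatic - new_asymptomatic
            I[g] += new_symptomatic - gamma * I[g]
            A[g] += new_asymptomatic - gamma_prime * A[g]
            cases[g] += new_symptomatic
    return cases


def knock_out(beta):
    """Run the control plus one scenario per route with that beta set to 0."""
    results = {"control": simulate(beta)}
    for route in beta:
        knocked = dict(beta)
        knocked[route] = 0.0
        results[route + "=0"] = simulate(knocked)
    return results
```

the contribution of a route is then the difference between the control case count and the count with that route set to 0, reported separately for males and females as in the scenarios listed above.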
because nine parameters, namely k, ω, γ, γ', p, br, dr, f and q, were obtained from references and the hubei statistical yearbook, their uncertainty could influence the model. in our model, the nine parameters were split into 1000 values, as indicated in table 2. considering that the simulated model method was the same in each year, we performed sensitivity analysis in 2010 (a year with intermediate reported incidence and case numbers in fig. 4a). the results of the curve fitting indicated that the seiar model fitted the data effectively (fig. 7). the r 2 values of the seiar model for the different genders each year are presented in table 3. in 2010, the reported data of all individual groups exhibited a significant fitting effect with simulated data in hubei province (fig. 8), wuhan city, and yichang city (fig. 9). according to fig. 10, the results of the "knock-out" simulation demonstrated that the numbers of cases in the different genders using the parameters β mm = 0, β ff = 0, β mf = 0 and β fm = 0 were lower than that in the control group. when β fm = 0, the number of cases decreased the most in the different genders. in 2010, a total of 12 340 cases were reported in hubei province (873 cases in yichang city and 5 899 cases in wuhan city). the "knock-out" simulation demonstrated similar results of the contribution in four transmission routes between wuhan and yichang city, but different results from hubei province (fig. 11). fig. 12 presents the difference between the mean and 95% confidence interval (ci) from 2005 to 2017 when using β mm , β ff , β mf and β fm . the mean value was 1.9240 × 10 -9 (95% ci: 1.6621 × 10 -9 to 6.6121 × 10 -9 ) when using β mm , 1.5645 × 10 -9 (95% ci: 1.3521 × 10 -9 to 1.7769 × 10 -9 ) when using β ff , 2.1572 × 10 -9 (95% ci: 1.9159 × 10 -9 to 2.3986 × 10 -9 ) when using β fm , and 1.8750 × 10 -9 (95% ci: 1.6846 × 10 -9 to 2.0654 × 10 -9 ) when using β mf . the results of the sar from 2005 to 2017 are presented in fig. 13.
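the sensitivity procedure described above, splitting a parameter's range into 1000 values and rerunning the model for each, can be sketched as follows. to keep the example short, the output here is the assumed sar expression β((1 - p)/γ + pk/γ') rather than the simulated case count used in the study, and only γ is swept; both choices are simplifications on our part.

```python
import statistics


def linspace(lo, hi, n):
    """n evenly spaced values from lo to hi inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def sar(beta, p, k, gamma, gamma_prime):
    # Assumed SAR form; the published formula is not present in this text.
    return beta * ((1.0 - p) / gamma + p * k / gamma_prime)


def sweep_gamma(beta=1.9240e-9, p=0.1, k=0.3125, gamma_prime=0.0286,
                lo=0.0477, hi=0.1428, n=1000):
    """Evaluate the output for n values of gamma across its quoted range."""
    outputs = [sar(beta, p, k, g, gamma_prime) for g in linspace(lo, hi, n)]
    return outputs, statistics.mean(outputs), statistics.stdev(outputs)
```

a parameter to which the output is insensitive would give nearly identical values at the mean, mean - sd and mean + sd of the sweep, whereas a sensitive parameter such as γ spreads them apart.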
the median value of sar mm was 2.32 ci: 3.23 × 10 -9 to 1.18 × 10 -8 ) and β 64 (mean: 6.04 × 10 -9 , 95% ci: 2.41 × 10 -9 to 9.67 × 10 -9 ). based on the 1000 times that the model ran, the model was not sensitive to the parameters br, dr, f, q and γ'. the simulated numbers of cases were the same for the mean, mean - standard deviation (sd) and mean + sd values (fig. 15). our model was slightly sensitive to the parameters ω, k and p (fig. 16a,b,c). meanwhile, high sensitivity to parameter γ (0.0741) was demonstrated, as illustrated in fig. 16d.
fig. 11 the results to simulate the contribution of β during the transmission in different genders. a: male; b: female; β mm = 0, interrupt transmission among male; β ff = 0, interrupt transmission among female; β fm = 0, interrupt transmission from female to male; β mf = 0, interrupt transmission from male to female; none: control
several mathematical models (such as the time-series susceptible-infectious-recovered and seiarw) have been established to determine the dynamics of shigellosis [17, 35]. however, our study is the first to clarify the transmission of shigellosis between both genders globally. in this study, we used the seiar model to study the transmission of the water/food-borne infectious disease and explored the transmission routes in the different sex-age groups further. the results provide guiding significance for controlling the prevalence of shigellosis. according to r 2 of the linear regression, the seiar model exhibited a high goodness of fit with the reported data in the different genders. moreover, it was consistent with the results of previous research [17], suggesting that the model was suitable for this study. according to the results of the sensitivity analysis, the model was more sensitive to parameter γ. therefore, the results would be more reliable if γ was collected from real data, instead of from the literature.
in recent years, although the incidence of shigellosis exhibited a decreasing trend in china [6, 26, 36], relatively high levels still occurred in hubei province from 2005 to 2017. different incidences of shigellosis cases in males and females were observed by the descriptive epidemiology [37, 38]. however, few clarifications of the causes of this difference and the transmission features have been provided. a study indicated that there were more cases in males than in females (the male-to-female ratio was 1.3:1), which is consistent with our results in the descriptive epidemiology [39]. the transmission pattern of shigellosis has shifted from water/food-to-person to person-to-person, with high risk groups being particularly men who have sex with other men (msm) in developed countries [1]. meanwhile, numerous studies have reported that the incidence in males is higher than that in females [6] [7] [8]. does this mean that the transmissibility of shigellosis among males is stronger than that among females? the seiar model was developed to verify this hypothesis. however, we obtained the number of cases in five scenarios using "knock-out" simulation. when β fm = 0, the number of cases decreased the most in both genders, which means that female-to-male transmission contributed significantly during the transmission. therefore, it is important to isolate and treat female cases as well as to strengthen personal health. in this study, we modelled the reported data from two cities in hubei province. the results of the "knock-out" simulation demonstrated that the decreasing trend of wuhan city was similar to that of yichang city, but both exhibited a certain disparity compared to the results of hubei province. according to fig. 9, there were differences in the cases reported from wuhan city and yichang city for 2010. both cities exhibited similar ascending and descending trends during each time for the same gender, but the results differed from those of hubei province. this could be related to the proportion of male and female cases reported daily. regional differences may not be the main influential factor for the incidences in terms of gender. compared to hiv, which exhibits different transmissibility in different genders, shigellosis is not particularly highly contagious in the different genders [40]. our results demonstrated that the mean values of the transmission parameters among males and females, from male to female, and from female to male differed, with the following order: β fm > β mm > β mf > β ff . the median values of the sar exhibited the following order: sar fm > sar mf > sar mm > sar ff . because a model of the total population in hubei was constructed, the value of sar was small and within the neighborhood of zero. however, this did not affect the quantification of the transmissibility of shigellosis. a previous study indicated a high incidence in msm in developed countries owing to unprotected sex and oro-anal contact [1]. however, the proportion of msm in china is not large. this finding may be related to the fact that the contact rate between males and females, such as kissing, embracing, and shaking hands, is higher than within genders.
fig. 12 the parameter of β mm , β ff , β mf and β fm during the transmission from 2005 to 2017 in hubei. a: β mm , transmission relative rate among male; b: β ff , transmission relative rate among female; c: β mf , transmission relative rate from male to female; d: β fm , transmission relative rate from female to male
fig. 13 the sar mm , sar mf , sar fm and sar ff estimated by model from 2005 to 2017 in hubei. sar: secondary attack rate; subscript mm, among male; mf, from male to female; fm, from female to male; ff, among female
the results indicate that the most significant transmission route is from female to male. superior hygiene behaviours may be responsible for the lower incidence in females than in males. the main reason that males are more susceptible than females may be the superior lifestyle habits, such as hand washing, of female individuals. moreover, females generally carry out more tasks such as cooking in the home. this finding suggests the importance of encouraging females to wash their hands before cooking. the results of this study are consistent with those of most research [41, 42], which have indicated a heavy disease burden in children under 6 years. there is no doubt that children have a relatively high susceptibility compared to other ages. furthermore, it is apparent that children often exhibit poor habits such as not washing their hands after using the toilet or before meals. our results demonstrate that the main transmission route is from the elderly to children. there is a custom in china whereby young parents leave their children in the grandparents' care. this suggests that the most important intervention may be the need to cut off transmission from the elderly. according to the epidemic characteristics of bacterial dysentery, control measures could be implemented in terms of the following aspects: a) focus on females cooking in the home and grandparents caring for grandchildren, such as advocating hand washing. b) encourage effective hygiene habits to reduce the susceptibility of male individuals and children. c) reduce the frequency of social behaviour such as kissing, embracing and shaking hands.
fig. 14 the transmission relative rate in different age and gender groups in 2010. β 0 : transmission relative rate within female; β ij refers to transmission relative rate of gender and age group from i to j, i and j represent subscript 1 to 6, subscript 1 was defined as male and ≤ 5 years old, 2 was male and between 6 to 59 years old, 3 was male and ≥ 60 years old, 4 was female and ≤ 5 years old, 5 was female and between 6 to 59 years old, and 6 was female and ≥ 60 years old; the data of 2010 were divided into 22 stages based on the following simulated periods,
limitations
several influential factors contributed to the year 2010 being considered for estimating the transmission features in the different age groups. it is possible that the transmission would vary according to changes in human behaviour. thus, further research is required to explore the transmission characteristics of hubei province. numerous studies have indicated that shigella consists of four species, namely dysenteriae, boydii, flexneri, and sonnei, among which the final two are the most common in low- and middle-income countries [36, 43, 44]. in our study, the dataset was obtained from routine infectious disease surveillance of the cdc in hubei province with no reported information regarding the shigella species. we believe that it is highly necessary to estimate the transmissibility in different shigella species. additional data for the different species will need to be collected for analysis. the results have been affected given that we supposed that β w = 0 in the seiar model and ignored environmental factors (such as water and food). moreover, owing to the limited availability of data, sociological components (for example, occupations, and cultural and societal backgrounds) were not considered in the model. additional data relating to sociological factors need to be collected for analysis.
finally, the parameters of the seiar model were obtained from relevant references and the hubei statistical yearbook, and not from firsthand data, which had an impact on the accuracy of our model. in hubei province, the incidence of shigellosis in males is higher than that in females. the transmissibility between the genders is higher than that within the genders, particularly female-to-male transmission. the main transmission route in children (age ≤ 5 years) is transmission from the elderly (age ≥ 60 years). therefore, the greatest interventions should be applied in females and the elderly. supplementary information accompanies this paper at https://doi.org/10.1186/s40249-020-00654-x. additional file 1. the contribution of β w in seiarw model. additional file 2. sex-age based seiar model.
cdc: center for disease control and prevention; seiarw: susceptible-exposed-infectious/asymptomatic-recovered-water/food; seiar: susceptible-exposed-infectious/asymptomatic-recovered; sar: secondary attack rate; ci: confidence interval; sd: standard deviation; msm: men who have sex with other men
environmental drivers and predicted risk of bacillary dysentery in southwest china the global burden of diarrhoeal disease regional disparities in the burden of disease attributable to unsafe water and poor sanitation in china.
b world health organ multistate shigellosis outbreak and commercially prepared food, united states spatiotemporal characteristics of bacillary dysentery from 2005 to 2017 in zhejiang province spatial-temporal pattern and risk factor analysis of bacillary dysentery in the beijing-tianjin-tangshan urban region of china the changing epidemiology of bacillary dysentery and characteristics of antimicrobial resistance of shigella isolated in china from gender and the hygiene hypothesis risk factors for shigellosis in thailand effects of ambient temperature on bacillary dysentery: a multi-city analysis in anhui province impact of meteorological factors on the incidence of bacillary dysentery in beijing, china: a time series analysis meteorological variables and bacillary dysentery cases in changsha city, china patterns of bacillary dysentery in china socio-economic factors of bacillary dysentery based on spatial correlation analysis in guangxi province spatiotemporal risk of bacillary dysentery and sensitivity to meteorological factors in hunan province investigation of key interventions for shigellosis outbreak control in china risk factors for secondary transmission of shigella infection within households: implications for current prevention policy detection of intra-familial transmission of shigella infection using conventional serotyping and pulsed-field gel electrophoresis asymptomatic salmonella, shigella and intestinal parasites among primary school children in the eastern province world health organization. foodborne disease outbreaks, guidelines for investigation and control. geneva: who an outbreak of foodborne infection caused by shigella sonnei in west bengal, india a school outbreak of shigella sonnei infection in china: clinical features, antibiotic susceptibility and molecular epidemiology a brief history of shigella prevention cfdca: shigella -shigellosis. 
2020 trend and disease burden of bacillary dysentery in china global burden of shigella infections: implications for vaccine development and implementation of control strategies gene knockout technique. 2020 the effectiveness of age-specific isolation policies on epidemics of influenza a (h1n1) in a large city in central south china evidence-based interventions of norovirus outbreaks in china risk of imported ebola virus disease in china simulation of key interventions for seasonal influenza outbreak control at school in changsha estimating the transmissibility of hand, foot, and mouth disease by a dynamic model transmissibility of acute haemorrhagic conjunctivitis in small-scale outbreaks in hunan province dynamics of shigellosis epidemics: estimating individual-level transmission and reporting rates from national epidemiologic data sets an 11-year study of shigellosis and shigella species in taiyuan, china: active surveillance, epidemic characteristics, and molecular serotyping the epidemiological influence of climatic factors on shigellosis incidence rates in korea spatial-temporal detection of risk factors for bacillary dysentery in beijing identifying high-risk areas of bacillary dysentery and associated meteorological factors in wuhan mathematical models for hiv transmission dynamics: tools for social and behavioral science research burden and aetiology of diarrhoeal disease in infants and young children in developing countries (the global enteric multicenter study, gems): a prospective, case-control study use of quantitative molecular diagnostic methods to identify causes of diarrhoea in children: a reanalysis of the gems case-control study shift in serotype distribution of shigella species in china university of texas medical branch at galveston. 1996; chapter 22 we thank the staff members in the hospitals, local health departments, and local cdcs for their valuable assistance in coordinating data collection. 
authors' contributions tc, bz, and zz designed the study. qc collected data. tc, zz, qc, bz, nw, yw, xx, jr, sy, mc, yw, xl, ra, lp, and ys performed the analysis. tc, zz, nw, and mnh wrote the first draft of this paper. all authors contributed to the writing of the manuscript. the author(s) read and approved the final manuscript. availability of data and materials extra data are available on reasonable request by emailing dr. qi chen (317342267@qq.com). this effort of disease control was part of cdc's routine responsibility in hubei province, china. therefore, institutional review and informed consent were not required for this study. all data analysed were anonymized. not applicable.
key: cord-321727-xyowl659 authors: wang, lishi; li, jing; guo, sumin; xie, ning; yao, lan; cao, yanhong; day, sara w.; howard, scott c.; graff, j. carolyn; gu, tianshu; ji, jiafu; gu, weikuan; sun, dianjun title: real-time estimation and prediction of mortality caused by covid-19 with patient information based algorithm date: 2020-07-20 journal: sci total environ doi: 10.1016/j.scitotenv.2020.138394 sha: doc_id: 321727 cord_uid: xyowl659 the global covid-19 outbreak is worrisome both for its high rate of spread and for the high case fatality rate reported by early studies and now in italy. we report a new methodology, the patient information based algorithm (piba), for estimating the death rate of a disease in real-time using publicly available data collected during an outbreak. piba estimated the death rate based on data of the patients in wuhan and then in other cities throughout china. the estimated time from hospital admission to death was 13 days (standard deviation (sd), 6 days). the death rates based on piba were used to predict the daily numbers of deaths since the week of february 25, 2020, in china overall, hubei province, wuhan city, and the rest of the country except hubei province. the death rate of covid-19 ranges from 0.75% to 3% and may decrease in the future.
the results showed that the actual death numbers fell within the predicted ranges. in addition, using the preliminary data from china, the piba method was successfully used to estimate the death rate and predict the death numbers of the korean population. in conclusion, piba can be used to efficiently estimate the death rate of a new infectious disease in real-time and to predict future deaths. the spread of 2019-ncov and its case fatality rate may vary in regions with different climates and temperatures from hubei and wuhan. a piba model can be built based on known information of early patients in different countries.
• the mortality rate determines whether a highly infectious disease becomes a public concern.
• summarizing information after the fact does not contribute to real-time readiness to deal with the disease.
• the patient information based algorithm (piba) estimates the death rate of a disease in real-time.
• piba can be used to estimate the death rate of a new infectious disease in real time and to predict future deaths.
the mortality rate is the most important factor that determines whether a highly infectious disease becomes a public concern and carries the risk of causing a pandemic. different virus epidemics take place throughout the world every year, but only a few rise to the level of public concern (schlagenhauf and ashraf, 2003; viboud and simonsen, 2012; who ebola response team, 2014). severe acute respiratory syndrome (sars), swine influenza a h1n1 virus (h1n1), and zaire ebolavirus (ebola) drew public attention because they caused many severe infections and thousands of deaths (dawood et al., 2012; nicholls et al., 2003; who ebola response team, 2014).
similarly, covid-19, the disease caused by a coronavirus (2019-ncov), drew world-wide attention and caused public panic because many deaths had been reported without being put in the context of the many mild infections and its potentially low case fatality rate (chan et al., 2020; huang et al., 2020; wang et al., 2020; wu et al., 2020). for example, influenza rarely causes public concern because, even though it is a common infection, it leads to death in only 0.1% of cases. a variety of reports indicate that 2019-ncov is highly infectious through multiple routes (huang et al., 2020; wu et al., 2020). while the high infection rate is certain, the mortality rate of covid-19 has not been definitively determined. it is reasonable to suspect that the death rate implied by the deaths of six of the first 41 patients (15%) in wuhan (huang et al., 2020), in the earliest reports by chinese scholars, was inaccurate. when the initial mortality rates were reported, only patients who were critically ill were included. patients with mild symptoms, as well as those with asymptomatic infections, were not analyzed (huang et al., 2020; wu et al., 2020). the case-fatality rate reported by huang et al. (2020) was based on a skewed patient sample, since it included only a small number of patients who had been transferred from other hospitals due to their critical condition. huang et al.'s sample was therefore concentrated on severely ill patients, while the general patient population includes more patients with covid-19 who are asymptomatic or have only mild symptoms and who have not been hospitalized. chen et al. (2020) reported an 11% death rate, again based on patients with severe conditions. we have estimated the mortality rate using a patient information based algorithm (piba). piba uses patient data in real-time to build a model that estimates and predicts death rates for the near future.
piba uses data of patients identified early in the disease process to calculate the average number of days from hospitalization to death for those hospitalized. another feature is that it takes into account variations based on mathematical models. the piba calculation method does not divide the number of deaths on a day by the total number of patients on the same day. instead, it divides the number of deaths on that day by the estimated number of patients on the earlier day or days when those patients had just begun to develop the disease. thus, piba comprehensively and reasonably estimates the mortality rate based on the actual number of deaths and the estimated number of patients on a specific earlier day. as time goes on, large amounts of data from northern and southern china have been accumulated through continuous reporting, all of which are used by piba, which then becomes more accurate as data accumulate. we conclude that it is time to utilize the accumulated data to estimate the case fatality rate of covid-19 infection. based on national data from the china national health center, the covid-19 death rate is much lower than that reported in huang et al. (2020). holistic data covering all of wuhan, the epicenter city of covid-19, also indicate a death rate lower than that reported by huang et al. these data sources cover a larger patient sample and include patients displaying symptoms with varying levels of severity. therefore, the updated estimation of the death rate should reference these larger-scale and more representative data. our study contributes to knowledge on the covid-19 death rate by building on huang et al.'s (2020) estimation and available data from official websites, and by addressing the limitations with a larger and more representative sample.
2.1. steps for estimating and predicting mortality using piba
1) to collect data from the patient's initial admission to death. strive to collect data for a certain number of patients.
2) to calculate the average number of days (μ) from hospital admission to death and the numbers of days at one standard deviation (μ ± σ) and two standard deviations (μ ± 2σ) from the mean.
3) to use these parameters (μ, μ ± σ, μ ± 2σ) to calculate the daily mortality during the epidemic.
4) to predict the future mortality of the infectious disease based on the calculated known mortality combined with the number of patients in a region. the predicted numbers are compared with real mortality to test and correct the model.
5) to conduct follow-up modification of the piba model according to different nationalities and regions. in particular, the initial patient data collected may vary significantly from country to country, from one ethnic group to another, and from region to region.
the calculation based on the number of deaths and the number of patients on the same day does not reflect the real death rate because most patients with covid-19 do not die on the same day that they entered the hospital (chan et al., 2020; huang et al., 2020; wang et al., 2020). with the piba method, we recognize that the patient population size was inaccurate in the early days but trust the published information on patients who died right after the covid-19 outbreak began. the estimation is built upon patient data with a normal distribution model. based on information about patients in wuhan who died during the period between dec 16, 2019, and jan 2, 2020 (huang et al., 2020), two parameters were estimated: days from onset of symptoms to death and days from admission to the intensive care unit (icu) to death. these two parameters are adopted in the estimation and prediction of the covid-19 death rate. each parameter has five values: the mean, μ; one standard deviation from the mean, μ ± σ; and two standard deviations from the mean, μ ± 2σ. we collected data from covid-19 patients in china from three public websites.
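the five lag values used by piba (μ, μ ± σ, μ ± 2σ) can be sketched as follows. this is only an illustration: the helper name `lag_days` and the sample values are hypothetical, and rounding to whole days is an assumption. clipping a negative lag to 0 follows the note in the source that a negative μ − 2σ was set to 0.

```python
from statistics import mean, stdev


def lag_days(days_to_death):
    """Return the five lags (mu-2s, mu-s, mu, mu+s, mu+2s) in whole days."""
    mu = mean(days_to_death)
    sigma = stdev(days_to_death)  # sample standard deviation
    lags = [mu + k * sigma for k in (-2, -1, 0, 1, 2)]
    # A negative lag has no meaning here, so it is set to 0 (assumption
    # mirroring the source's handling of a negative mu - 2*sigma).
    return [max(0, round(lag)) for lag in lags]


# Hypothetical sample chosen so that mu = 13 and sigma = 6.
example_days = [7, 13, 19]
print(lag_days(example_days))  # [1, 7, 13, 19, 25]
```

with a mean of 13 days and a standard deviation of 6 days, the five lags are 1, 7, 13, 19 and 25 days.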
the data from the whole country are collected and made available on the official website of the health emergency office of the national health commission of the people's republic of china at http://www.nhc.gov.cn/yjb/new_index.shtml. the data from hubei province and wuhan are from the health commission of hubei province at http://wjw.hubei.gov.cn/fbjd/dtyw/. these data include the numbers of patients who were confirmed as having covid-19, who died from the disease, whose condition was severe, and who were admitted to the hospital or icu. other collected data included daily new cases, new deaths, people who were in close contact with an infection source, and the accumulated number of patients. we paid particular attention to data from wuhan, plus two additional cities in hubei province, xiaogan and huanggang, in which the number of patients was higher than in other cities in hubei province. information from a northern province, heilongjiang, was collected from the official outbreak information website of the health commission of heilongjiang province at http://wsjkw.hlj.gov.cn/index.php/home/zwgk/all/typeid/42. data of heilongjiang province and harbin city were included because the province is located in the northern high-latitude zone. these data are used to assess whether covid-19 is more, less, or equally likely to spread to an area with a cold climate. collected information included numbers of patients and numbers of deaths from each city and in the whole province. for any missing data on any day, a formula was used to estimate the data for that day:

$$N_i = \frac{N_{i+j} + N_{i-j}}{j+1} + N_{i-j},$$

where $N_i$ is the estimated value of the missing data on day $i$ and $j$ is the number of days of missing data, usually 1; in rare cases, data of two consecutive days may be missing. if the data of two days are missing, the first day is considered as day $i$, and the second day, $N_{i+1}$, is calculated as

$$N_{i+1} = N_i + \frac{N_{i+j} + N_{i-j}}{j+1}.$$

based on the days between confirmation of covid-19 and death in the hospital, calculated from wuhan as mentioned in method 1, and on information from the whole country and hubei province, we tested which number of days from diagnosis to death most likely reflects the actual death rate. the estimated days are used to estimate the death rate using data from hubei province and wuhan city with the five values from above (μ, μ ± σ, μ ± 2σ). in consideration of the contribution of a variety of sources to the estimation, we incorporated the five values (μ, μ ± σ and μ ± 2σ) into piba and built the testing model as follows.

1) death rate at increments: $$M_i = \frac{D_i - D_{i-1}}{P_{i-n} - P_{i-n-1}}$$

2) death rate at accumulative numbers: $$M_i = \frac{D_i}{P_{i-n}}$$

where $M_i$ = mortality rate, $D_i$ = the cumulative number of deaths on day $i$, $P_i$ = the cumulative number of patients on day $i$, $i$ = the current day for calculating the death rate, and $n$ = the number of days from severe infection to death. treating these five values as the centres of intervals one standard deviation wide in a normal distribution, each of the five death rates calculated above on each day has its own weight from the normal distribution ($W_{\mu}$ = 38.2%, $W_{\mu-\sigma} = W_{\mu+\sigma}$ = 24.2%, and $W_{\mu-2\sigma} = W_{\mu+2\sigma}$ = 6.7%). from here, we could give the death rate for every single day a single value that results from the weighted average of all five cohorts of patients, as defined by time from severe illness to death. the equation is as follows:

$$D = M_{\mu-2\sigma}W_{\mu-2\sigma} + M_{\mu-\sigma}W_{\mu-\sigma} + M_{\mu}W_{\mu} + M_{\mu+\sigma}W_{\mu+\sigma} + M_{\mu+2\sigma}W_{\mu+2\sigma},$$

where $D$ = death rate, $M_{\mu}$ = mortality rate with a gap of μ days, $W_{\mu}$ = weight for a gap of μ days, μ = mean of the normal distribution, and σ = standard deviation.
2.5. confirmation of the best estimation of the days to calculate the death rate in the other cities
the same formula was then used to estimate the death rate from the other two cities in hubei province, namely xiaogan and huanggang. the piba model was developed using data from hubei province, including wuhan, xiaogan, and huanggang, and was further validated using data from heilongjiang province and harbin city. piba was then used to predict trends in the number of new deaths.
fig. 1. a. distribution of days between disease symptoms and death and between time of icu admission and death. vertical axis: days, horizontal axis: cases. b. estimated days from first symptoms to death and days from icu admission to death. c. lagging days (days from first symptoms to the day of death), μ, μ ± σ and μ ± 2σ, and their weight (in percentages) used for the estimation of death rate in the broader patient population. note: among these values, the lagging day μ − 2σ from symptom confirmation to death in panel b, which equals −3, has been set to 0.
in order to further test the validity of our piba method in predicting actual mortality, we used a combination of the curve trend data and the overall mortality rate of the country, hubei, wuhan, and the rest of the country (china overall except hubei). based on our prediction of the days from actual hospitalization to death, we separately predicted the number of deaths on each day of the coming week. that is, from the numbers of new patients on the 7th, the 13th, and the 19th day before the targeted prediction day, we obtained three numbers of deaths for each of the predicted days. the lowest and the highest of these three numbers are then used as the minimum and the maximum number of predicted deaths on that day, respectively. the same formula was also used to predict the number of deaths over a week in south korea.
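the estimation and prediction steps described above can be sketched in python as follows. this is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the toy series are hypothetical, and multiplying a single death rate by the number of new patients in `predicted_range` is an assumption about how the three daily estimates are formed. the weights are computed as the mass of a standard normal distribution in five bins one standard deviation wide, which comes out close to the 38.2%, 24.2% and 6.7% quoted in the text.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def bin_weights():
    """Mass of five one-sigma-wide bins centred on mu-2s, mu-s, mu, mu+s, mu+2s.

    The two outer bins also absorb the tails, so the weights sum to 1.
    """
    edges = [-1.5, -0.5, 0.5, 1.5]
    cdf = [0.0] + [normal_cdf(e) for e in edges] + [1.0]
    return [cdf[k + 1] - cdf[k] for k in range(5)]


def cumulative_rate(deaths, patients, i, n):
    """M_i = D_i / P_{i-n}, with cumulative deaths D and cumulative patients P."""
    if i - n < 0:
        raise ValueError("series does not reach back n days")
    return deaths[i] / patients[i - n]


def incremental_rate(deaths, patients, i, n):
    """M_i = (D_i - D_{i-1}) / (P_{i-n} - P_{i-n-1})."""
    if i - n - 1 < 0:
        raise ValueError("series does not reach back n + 1 days")
    return (deaths[i] - deaths[i - 1]) / (patients[i - n] - patients[i - n - 1])


def weighted_rate(deaths, patients, i, lags, weights, rate=cumulative_rate):
    """D = sum over the five lags of M(lag) * W(lag)."""
    return sum(w * rate(deaths, patients, i, n) for n, w in zip(lags, weights))


def predicted_range(new_patients, t, rate, lags=(7, 13, 19)):
    """Minimum and maximum predicted deaths on day t from three lagged cohorts."""
    if t - max(lags) < 0:
        raise ValueError("series does not reach back far enough")
    estimates = [rate * new_patients[t - lag] for lag in lags]
    return min(estimates), max(estimates)


# Hypothetical toy series, 30 days long.
patients = [100 + 10 * k for k in range(30)]  # cumulative patients
deaths = list(range(30))                      # cumulative deaths
lags = [1, 7, 13, 19, 25]                     # mu - 2s ... mu + 2s for mu = 13, s = 6
print([round(100 * w, 1) for w in bin_weights()])
print(weighted_rate(deaths, patients, 29, lags, bin_weights()))
```

the lags and weights are passed in the same order (from μ − 2σ to μ + 2σ), so each lagged rate is paired with the weight of its own bin.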
using information published by wuhan, we calculated the days between icu admission and death. we obtained the actual data from 33 patients who died in the hospital in wuhan. the days from onset of symptoms to death ranged from 6 to 30 (see fig. 1a). from icu intake to death, the shortest interval is one day, and the longest is 22 days. we derived two parameters from the 33 death cases, i.e., the days from onset of symptoms to death and the days from inpatient admission to death. since six of these 33 patients had the same date for symptom onset and inpatient admission, there were 33 values in the dataset related to inpatient admission and 27 values in the dataset related to symptom onset (fig. 1a). the results indicated that the average time from onset of the symptoms to death is 13 days (m = 13, s.d. = 6) (see fig. 1b). accordingly, the lagging days from the day of death and their weight in the calculation of death rate were derived based on the new inpatient days (fig. 1c). the prediction of death rate is based on data from wuhan city, where patients diagnosed with covid-19 had been confirmed since january 19, 2020 and where deaths had occurred among the first confirmed cases of coronavirus.
3.2. estimated death rate for the whole country and hubei province using piba formula
according to our five estimation parameters, from illness (i.e., symptom appearance) to death, the maximum number of days is 25 days. the earliest reported data in wuhan was published on january 19, 2020. based on these data, we were able to calculate the mortality rate from february 8, 2020, to the present. however, on february 12, the national health committee revised the data again (see appendix table 1). because of this amendment, the number of confirmed cases appeared to have changed significantly in only one day.
we chose the calculation results from february 14 up to february 25 (appendix table 2), considering that the death rates on february 12 and february 13 are likely distorted by this sharp rise within a short term. fig. 2a through 2d provide information about the overall death rates in mainland china (hereafter referred to as country), hubei, wuhan, and rest of country (excluding hubei) (appendix table 3). we noticed that the death rate at increments based on piba in the whole country (in blue) in fig. 2a is below 10%, with most values between 2.7% and 6% in the last five days. the death rate in hubei province is similar to that of the whole country because 90% of the patients in the whole country were from hubei province (see appendix table 1) (fig. 2b). in wuhan, the accumulated death rate was still high, as much as 20% (fig. 2d). when we used the data from the rest of the country to test our piba formula, as expected, the curve is different from the curves from hubei and wuhan. unlike in hubei and wuhan, the death rate of the rest of the country is much lower and stable, mostly lower than 1% (fig. 2c). the predicted death rate will remain between 1% and 2% for the near future. xiaogan and huanggang are two cities in hubei province. the number of patients with covid-19 in these two cities is higher than in other cities in hubei except wuhan. they are also the cities with the largest number of patients with covid-19 in china. we, therefore, tested the piba formula using data from these two cities. currently, the death rate based on the increment data is around 3%, lower than that in wuhan but higher than that in the rest of the country. however, according to piba, the rate of deaths may decrease in the near future. heilongjiang province, including its capital city, harbin, is the province outside of hubei with the largest number of diagnosed patients. harbin city is located in the northeast of china and is in the coldest area in china.
no patients from harbin city or heilongjiang province were reported during the sars epidemic period. we used the piba formula to estimate the death rate in both heilongjiang province (fig. 3c) and harbin city (fig. 3d). the death rate of harbin decreased sharply in the past several days, to 0%. the low rate of less than 1% will possibly remain for the future.
fig. 3. death rate estimations of four places. the blue curve represents the mortality calculated by the actual increase in deaths per lagging day divided by the increase in actual patients on the previous corresponding day. the gray curve represents the total number of deaths per lagging day, divided by the total number of identified actual patients on the corresponding previous day. the orange curve shows the number of deaths per day divided by the total number of patients the same day. numbers on the vertical axis represent the death rate; on the horizontal axis is the date. a. the death rate of xiaogan city in hubei province. b. the death rate of huanggang city in hubei province. c. the death rate in heilongjiang province. d. the death rate in harbin city.
based on piba and the death rate of accumulated numbers, the expected final death rate of the whole country, hubei, wuhan, and the rest of the country except hubei is predicted as follows (see table 1). the predicted values are from the intersection points between the incremental estimation and the net values estimation. we used the predicted death rate to calculate the potential number of deaths per day in the coming week. because our initial estimation of the lagging days between inpatient admission and death was based on only 33 patients, we used the average of 13 days plus one standard deviation (19 days) and minus one standard deviation (7 days) to obtain the range of the number of deaths on a given day in the coming week (see appendix table 4, predicted number of deaths in the days of the coming week after february 25, 2020). as shown in fig. 4, the actual number of deaths in the past four days fell into the predicted range. in the country (fig. 4a), hubei (fig. 4b), and wuhan (fig. 4c), the numbers of actual deaths were near the predicted minimum numbers, while for the rest of the country except hubei, the actual death data fluctuate between the predicted maximum and minimum values (fig. 4d). because the number of newly infected patients has been dropping in the last few days, the total number of patients will tend to stay constant or even decrease in the coming days if unexpected events do not occur. the peaks in these figures reflect sudden changes in numbers of patients (see fig. 4). we believe that the intersection point between the trendlines can reasonably be taken as one plausible value within the range of the death rate for patients infected in the future. as shown in the data above, the incidence in mainland china's provinces and cities was basically zero in mid-to-late march. because of this, we were not able to prove the feasibility of this method in more regions in mainland china. however, because the environment, medical conditions, and ethnic composition of the population differ between countries, to test the usefulness of the piba model in other countries, we need to get the basic information of the initial population. this information includes the specific number of days from onset to death of a reasonable number of patients in different regions of different countries. at present, we could not access these data accurately. the only thing we can do is to test asian countries such as south korea and japan based on their ethnic similarities with populations in china. taking all aspects into consideration, we believe that south korea's data are more reliable. therefore, we further tested our model using the affected population in south korea. as shown in fig. 5, the trend of deaths in south korea in recent days is consistent with our prediction.
fig. 4. comparison between the predicted number of deaths based on piba and the actual number of deaths. the blue line represents the estimated minimum number of deaths. the orange line represents the estimated maximum number of deaths. the gray line represents the actual number of deaths. panels a, b, c, and d show these death numbers in the country, hubei, wuhan, and the rest of the country except hubei.
first, piba is capable of accurately estimating the disease mortality and the number of future deaths. this real-time accurate prediction and estimation of disease mortality provides the public, government, and society with more accurate disease information. based on currently available data that include patients with varying degrees of severity, the estimated mortality rate of covid-19 is less than 3%, lower than the prior prediction based on limited available data. this finding may ease public concern and panic. updated scientific findings will be widely disseminated to broaden public awareness and contribute to helping fight covid-19. the medical, clinical, and research community should strive to publish scientifically rigorous findings related to urgent public health issues. publishing findings based on limited data contributes to unnecessary public concern and government action. in this particular case, the first report on the estimation of the coronavirus death rate is a laudable effort. however, it also had the limitations of a skewed dataset that focused on patients who were transferred from local hospitals because of their critical condition while excluding patients with less severe symptoms who remained at local hospitals. as soon as more data are available, we should provide updated reports and introduce improved estimation and prediction algorithms. this study indicates that as the number of transmissions of 2019-ncov increases among the human population, its lethality will gradually decrease.
indeed, the reasons are not necessarily all due to reduced virulence of the virus. there may also be improvements in treatments and implementation of early detection methods. therefore, a real-time estimate of death rate using patient information, such as the piba method, would demonstrate an appreciation of the importance of public and societal awareness. a critical issue to consider is that if the mortality rate of covid-19 in a certain area is relatively high, covid-19 in the area is still spreading and endemic.
fig. 5. test of the piba model using the covid-19 population from south korea. a. estimation of death rate in the korean population using the piba method. the blue curve represents the mortality calculated by the actual increase in deaths per lagging day divided by the increase in actual patients on the previous corresponding day. the gray curve represents the total number of deaths per lagging day, divided by the total number of identified actual patients on the corresponding previous day. the orange curve shows the number of deaths per day divided by the total number of patients the same day. numbers on the vertical axis represent the death rate; the horizontal axis shows the date. b. comparison between the predicted number of deaths based on piba and the actual number of deaths. the blue line represents the estimated minimum number of deaths. the orange line represents the estimated maximum number of deaths. the gray line represents the actual deaths.
one of the most obvious questions is why the mortality rate in wuhan is considerably higher than in other places. based on our assessment, wuhan's medical equipment and rescue measures are comparable with other areas in china, and the pathogenicity of the virus is similar. we conclude that there is a large proportion of patients in wuhan who have mild illness and have not been hospitalized at all.
due to the uncertainty of the movement of infected people in the early stages of the onset, these mildly ill people moved around wuhan unidentified. this problem is a reminder to other parts of the world that if the fatality rate of covid-19 is found to be high, a large number of infected people have probably not been identified or diagnosed. therefore, the work of controlling and isolating this infected group has not been completed, and the disease is still spreading and circulating in the area. the data on heilongjiang province and harbin do not support some experts' predictions (cf. https://news.ifeng.com/c/7uhmhxcfhmq) that the disease will occur more intensely in high-latitude regions with a cold climate and that the mortality rate there will be higher. as 2019-ncov passes through successive generations, its virulence will gradually weaken, and we expect that the mortality rate in the cold northern regions will not increase, nor will it exceed that in wuhan or hubei province. our research has limitations, mainly due to available data. first, the estimate of the number of days from hospital admission or icu intake to death is based on data from official public websites, and information from only 33 individuals was used. if information had been available on more patients, the initial estimate would have been more accurate. the second aspect is the accuracy of the number of patients diagnosed and the number of hospitalizations per day. due to the back-and-forth revision and correction of the data announced by the official sources, we are not confident that all the data are error-free; however, we feel that these data as a whole are reliable. the third limitation of the piba method is that it depends on accurate patient information at the beginning of the epidemic. depending on the situation in different countries or regions, this information may or may not be available, or the information may not be accurate.
the piba model accurately predicted a case fatality of 1.6% for symptomatic patients in china at a very early stage in the covid-19 pandemic. the model can be generalized to predict case fatality for any infection (including asymptomatic infections), to predict the rate of severe disease, and to predict the death rate for patients who develop severe disease. these early, accurate predictions help the public, society, and governments to estimate the extent of the disease's harm and to develop suitable strategies. supplementary data to this article can be found online at https://doi.org/10.1016/j.scitotenv.2020.138394. the authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
chan et al., 2020. a familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster
chen et al., 2020. epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in wuhan, china: a descriptive study
dawood et al., 2012. estimated global mortality associated with the first 12 months of 2009 pandemic influenza a h1n1 virus circulation: a modelling study
huang et al., 2020. clinical features of patients infected with 2019 novel coronavirus in wuhan
nicholls et al., 2003. lung pathology of fatal severe acute respiratory syndrome
schlagenhauf and ashraf, 2003. severe acute respiratory syndrome spreads worldwide
viboud and simonsen, 2012. global mortality of 2009 pandemic influenza a h1n1
wang et al., 2020. a novel coronavirus outbreak of global health concern
who ebola response team, 2014. ebola virus disease in west africa - the first 9 months of the epidemic and forward projections
wu et al., 2020. nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study
this work was partially supported by funding from merit grant i01 bx000671 to wg from the department of veterans affairs and the veterans administration medical center in memphis, tn, usa and grant 90dduc0058 to cg from the u.s. department of health and human services, administration for community living. revision and approval of the manuscript: all authors. all the data of patients in this study are from official public websites.
key: cord-313675-fsjze3t2 authors: aslan, ibrahim halil; demir, mahir; wise, michael morgan; lenhart, suzanne title: modeling covid-19: forecasting and analyzing the dynamics of the outbreak in hubei and turkey date: 2020-04-15 journal: nan doi: 10.1101/2020.04.11.20061952 sha: doc_id: 313675 cord_uid: fsjze3t2 as the pandemic of coronavirus disease 2019 (covid-19) rages throughout the world, accurate modeling of the dynamics thereof is essential. however, since the availability and quality of data varies dramatically from region to region, accurate modeling directly from a global perspective is difficult, if not altogether impossible. nevertheless, via local data collected by certain regions, it is possible to develop accurate local prediction tools, which may be coupled to develop global models. in this study, we analyze the dynamics of local outbreaks of covid-19 via a coupled system of ordinary differential equations (odes). utilizing the large amount of data available from the ebbing outbreak in hubei, china as a testbed, we estimate the basic reproductive number, r0, of covid-19 and predict the total cases, total deaths, and other features of the hubei outbreak with a high level of accuracy. through numerical experiments, we observe the effects of quarantine, social distancing, and covid-19 testing on the dynamics of the outbreak. using knowledge gleaned from the hubei outbreak, we apply our model to analyze the dynamics of the outbreak in turkey. we provide forecasts for the peak of the outbreak and the total number of cases/deaths in turkey, for varying levels of social distancing, quarantine, and covid-19 testing.
in late 2019, the city of wuhan in the province of hubei, china experienced an outbreak of coronavirus disease 2019 (covid-19), the disease caused by the novel coronavirus sars coronavirus 2 (sars-cov-2). this outbreak quickly spread to all provinces of china and across the globe, and was declared a pandemic by the world health organization (who) on 11 march 2020. the authorities imposed a strict lockdown on the city of wuhan and other cities of hubei province on 23 january 2020 (world health organization, 2020b). in the face of over sixty-seven thousand cases and over three thousand deaths, the authorities continued strict enforcement of these measures (chinese physicians, 2020; coronavirus covid-19 global cases by johns hopkins csse, 2020). finally, on 23 march 2020, hubei reached a significant milestone as the province's health commission reported no new cases for seven consecutive days (world health organization, 2020b; the new york times, 2020). shortly thereafter, after over two months of severe restrictions on the movements of the hubei population, the "2020 hubei lockdowns" were relaxed as the hubei outbreak began to wane, inspiring hope that the global pandemic might be controlled. previous studies of covid-19 provided evidence of human-to-human transmission and revealed its similarities to and differences from sars (chan et al., 2020; huang et al., 2020; xu et al., 2020). however, data-driven simulation-based studies are needed to understand the dynamics of the ongoing outbreak.
indeed, it is of the utmost importance to use these tools to investigate the effectiveness of public health strategies, such as the number of covid-19 tests carried out to detect the infected, the level of quarantine/social distancing, and its efficiency in reducing the transmission of covid-19. a number of studies investigate the dynamics of this pandemic from a global perspective (see, e.g., imai, dorigatti, cori, riley, & ferguson, 2020; read, bridgen, cummings, ho, & jewell, 2020; riou & althaus, 2020; shen, peng, xiao, & zhang, 2020; zhao et al., 2020; cao et al., 2020). nevertheless, the large variations in both quality and availability of data from region to region make direct global modeling of the dynamics of this pandemic exceedingly difficult. as a result, in this study, we develop a model for the dynamics of the pandemic from a local perspective. the many "hotspots" of covid-19 combined with the many travel restrictions in place throughout the world further suggest that local models might provide more practical insights into the dynamics than their global counterparts. indeed, it stands to reason that accurate models for local regions can be coupled to develop reasonable models for larger regions. as of 23 march 2020, around one-quarter of the global covid-19 cases and consequent deaths had occurred in hubei. the large proportion of data available from hubei, combined with the region's recent achievements toward managing its local outbreak, suggests that the data from this region present an excellent picture of the lifetime of an outbreak of covid-19. indeed, as countries worldwide close their borders, cities and regions, and impose their own "shelter-in-place," quarantine, or lockdown orders in the face of the pandemic, the large amount of data available from hubei provides an excellent testbed for modeling the dynamics of a local outbreak of covid-19.
in this study, we start by developing a seiqr-type deterministic model which uses a system of ordinary differential equations to analyze the dynamics of the outbreak, in particular highlighting the effect of testing and the effects of quarantine and social distancing in hubei. we present estimates of the basic reproductive number r 0 of covid-19 in hubei and perform a sensitivity analysis to deduce which parameters play significant roles in the transmission and control of the outbreak in hubei. in addition, we provide 15-day forecasts of the fatality rate of the outbreak, the number of cases, and the number of deaths based on the data (chinese physicians, 2020; coronavirus covid-19 global cases by johns hopkins csse, 2020; world health organization, 2020b) and the outputs of our seiqr model. finally, building on knowledge obtained from the hubei outbreak, we apply our model to the outbreak in turkey. we forecast the peak of the outbreak and the total number of cases/deaths in turkey, utilizing the extant covid-19 data from turkey (ministry of health (turkey), 2020). a deterministic compartmental model has been developed by using ordinary differential equations (odes) to understand the dynamics of covid-19 in hubei, china (chubb & jacobsen, 2010; keeling & rohani, 2008; kot, 2001). in the model, the total population n (t) at time t is divided into the following six compartments: susceptible s(t), susceptible in quarantine (isolated class) s q (t), exposed e(t), infected (asymptomatic or having mild symptoms) i(t), reported (infected) cases (hospitalized if they develop severe symptoms or quarantined if they develop mild symptoms) i q (t), and recovered r(t). note that all individuals who test positive are immediately isolated. the transition flows among compartments are given in figure 1 .
the rate of reported cases i q denotes the number of individuals who transition from the infected class i to the reported class i q per day; it is also directly related to the daily number of covid-19 tests carried out during the outbreak. figure 1: flow diagram illustrating the disease transitions among the compartments. susceptible individuals transition to the s q (t) compartment at a rate that depends on the number of reported cases i q , which is the main indicator of quarantine. when the number of reported cases increases in a state or country, quarantine is either imposed or adopted voluntarily, so the percentage of people quarantined increases. if the number of reported cases falls to zero, the transition rate from s to s q is zero and the transition rate from s q to s is q s . the individuals in the s and s q compartments transition to compartment e (exposed) under a force of infection governed by the disease transmission rate β. since the individuals in s q transition to the e compartment less frequently, a reduction factor r is taken into account in the model. after an incubation period of 1/α, the individuals in the e compartment transition to the i compartment (infected) with rate α. the individuals in the i compartment either transition to the r compartment (recovered) with a rate of γ i , transition to the i q compartment with a rate of i q , or die due to the disease with a rate of µ i . the individuals in the i q compartment (reported (infected) individuals, who are hospitalized or quarantined) either transition to the r compartment with a rate of γ q or die due to the disease with a rate of µ q . the ode system (1) represents the dynamical behavior of the model. the left-hand side of the system (1) represents the rate of change per day. the system (1) has a feasible region in which all the compartments stay non-negative. the parameter values used in the model are given in table 1 with their descriptions and units.
3 disease-free equilibrium and stability analysis. one of the major concepts in an outbreak is the disease-free equilibrium (dfe), where the entire population is susceptible (keeling & rohani, 2008; diekmann, heesterbeek, & roberts, 2010). to obtain the dfe for the system (1), we set the right-hand side of the system (1) to zero and substitute the dfe into the system. we then analyze whether the dfe is stable. the next-generation matrix (ngm) is used (van den driessche & watmough, 2002; diekmann et al., 2010; van den driessche & watmough, 2008) to determine the stability of the dfe. we rewrite our system (1) in terms of x = (s * , s * q , e * , i * , i * q , r * ), with i = 1, ..., 6, and calculate f and v for the system (1). notice that individuals who transition to the e compartment are the only newly infected cases. therefore, the jacobians of f and v at the dfe are taken over the infected classes (the first three components of x), and we then compute the next generation matrix (ngm), given in (2). the spectral radius of the ngm is the basic reproduction number r 0 , defined as the average number of secondary cases arising from an average primary infected case in an entirely susceptible population. the dfe is locally stable if r 0 < 1 (van den driessche & watmough, 2002; diekmann et al., 2010). since we do not consider the individuals in the s q compartment as a part of the dfe, we do not see any direct effect of quarantine on r 0 . however, s q indirectly changes other parameter values such as β and i q ; therefore, the value of r 0 changes with s q indirectly. note that α and β are positively correlated with r 0 , whereas i q , γ i , µ i , and d are negatively correlated with r 0 for the system (1). we might therefore control the disease by increasing the quarantine rate of infected individuals, i q : if i q is large enough that r 0 < 1, then the dfe is locally stable and the disease dies out when the system is sufficiently close to the dfe. biologically, if infected individuals can be detected in a sufficiently short time, in other words if the number of tests carried out to detect cases increases, then the disease can be controlled.
in this part, we estimate the parameters in the system (1) by fitting our model to the daily reported cumulative numbers of cases and deaths, which are provided by (world health organization, 2020b) and (chinese physicians, 2020). we use the ordinary least squares (ols) method and minimize the sum of the squares of the differences between the daily reported data and those predicted by our model. the goodness of the fit is measured by computing the associated relative error of the fit using equation (3), where c i and ĉ i are the exact and estimated cumulative (infected) cases, and d i and d̂ i are the exact and estimated cumulative deaths. to estimate the number of covid-19 deaths, we sum the number of deaths coming from the infected class i and the reported (infected) class i q .
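the goodness-of-fit measure described above can be sketched in a few lines. the exact expression for the relative error is not reproduced in this text, so the form used here (a relative l2 error for each series, summed over cases and deaths) is an assumption rather than the authors' formula, and the function names and toy series are hypothetical.

```python
import math


def relative_error(observed, predicted):
    """Relative L2 error ||observed - predicted|| / ||observed|| (assumed form)."""
    if len(observed) != len(predicted):
        raise ValueError("series must have the same length")
    num = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)))
    den = math.sqrt(sum(o ** 2 for o in observed))
    return num / den


def ols_objective(cases, cases_hat, deaths, deaths_hat):
    """Sum of squared differences over both series: the quantity OLS minimises."""
    return (sum((c - ch) ** 2 for c, ch in zip(cases, cases_hat))
            + sum((d - dh) ** 2 for d, dh in zip(deaths, deaths_hat)))


def total_relative_error(cases, cases_hat, deaths, deaths_hat):
    """Assumed combination: relative error of cases plus relative error of deaths."""
    return relative_error(cases, cases_hat) + relative_error(deaths, deaths_hat)


# Hypothetical toy series, not data from the study.
cases = [100.0, 200.0, 400.0]
cases_hat = [110.0, 190.0, 400.0]
deaths = [3.0, 6.0, 12.0]
deaths_hat = [3.0, 6.0, 12.0]

print(ols_objective(cases, cases_hat, deaths, deaths_hat))
print(total_relative_error(cases, cases_hat, deaths, deaths_hat))
```

in a full fit, an optimiser would repeatedly solve the ode system for candidate parameters and pass the simulated cumulative series to `ols_objective`.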
note that the natural deaths in the infected class i and the reported (infected) class i q are also included in the total number of deaths. we used the ode45 solver with fmincon from the optimization toolbox of matlab. using the initial conditions s(0) = 59,000,000, s q (0) = 0, i q (0) = 258, and r(0) = 0, we estimate all the parameters of the model, together with the initial numbers of exposed e and infected i, except the natural death rate d, the recruitment rate π, and the incubation period 1/α. we used 4.2 days for the average incubation period, as provided by (nishiura et al., 2020; guan et al., 2020; sanche et al., 2020). the natural death and recruitment rates are provided by (world health organization, 2020a). the simulation results obtained for the cumulative number of (infected) cases c and cumulative deaths d by fitting the model with the data from january 20, 2020 to march 23, 2020 are depicted in figure 2 . these figures show a reasonably good fit, with a total relative error of 0.06 (6%). most of the error comes from the fit of cumulative cases, especially around february 12, 2020. in february, china began to report clinically diagnosed cases in addition to laboratory-confirmed cases, and on february 12, 2020, 13,332 clinically (rather than laboratory) diagnosed cases were reported even though they had been diagnosed in the preceding days and weeks. due to the very small number of cases reported after march 23, 2020, we chose to fit the model using only data up to this date. in this section, we discuss 15-day forecasting of the outbreak, the effect of quarantine, and the effect of testing in hubei.
we also conduct a sensitivity analysis to see which parameters play an important role in the dynamics of the outbreak. when we look at the change in the quarantine class, nearly the entire province of hubei was quarantined by february 15, 2020 (see figure 3 ; february 15 corresponds to day 25 in the figure). the percentage of the population transitioning from the susceptible class to the quarantine class attains its maximum level between january 27, 2020 and february 10, 2020, and by february 15, almost all of the population was in quarantine. this result makes sense since the provincial government imposed a quarantine in the province on january 23, 2020, initially recommending quarantine and finally forcing people into quarantine to guarantee social distancing. this action seems to have worked to great effect, reducing the contact rate by about 98.9% (see table 1 for the parameter r, the reduction rate due to the quarantine). when we reduce the quarantine rate s q from 0.096 to 0.0864 (a 10% reduction) and do not change the remaining parameters, the number of cases and deaths would be about 141090 and 6562, respectively. similarly, when we increase the quarantine rate s q from 0.096 to 0.1056 (a 10% increase), the number of cases and deaths would be about 39334 and 1829, respectively. thus, even a small change in the quarantine rate makes a very significant change in the total number of cases and deaths. furthermore, as the sensitivity analysis section below shows, the quarantine rate is a significant parameter in the dynamics of the outbreak, as is its efficiency, which is captured by the parameter r.
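the kind of quarantine-rate experiment described above can be sketched with a small simulation. the equations of system (1) are not reproduced in this text, so the right-hand side below is an assumption assembled from the verbal description of the flows: the force of infection is taken as β i / n, the quarantine flow is switched on by a hypothetical factor i q /(1 + i q ), and contacts in s q are scaled by (1 − r). the incubation period (4.2 days), r = 0.989, s q = 0.096 and the initial conditions come from the text; every other parameter value is hypothetical, so the printed numbers are not expected to match the figures quoted above.

```python
def seiqr_rhs(y, p):
    """Right-hand side of an assumed SEIQR-type system (rates per day)."""
    S, Sq, E, I, Iq, R, D, C = y
    N = S + Sq + E + I + Iq + R
    on = Iq / (1.0 + Iq)                  # assumed switch: ~1 while cases are reported, 0 when none
    lam = p["beta"] * I / N               # assumed force of infection
    to_q = p["s_q"] * on * S              # S -> Sq
    from_q = p["q_s"] * (1.0 - on) * Sq   # Sq -> S
    inf_s = lam * S
    inf_sq = (1.0 - p["r"]) * lam * Sq    # contacts reduced by a fraction r in quarantine
    dS = p["pi"] - to_q + from_q - inf_s - p["d"] * S
    dSq = to_q - from_q - inf_sq - p["d"] * Sq
    dE = inf_s + inf_sq - (p["alpha"] + p["d"]) * E
    dI = p["alpha"] * E - (p["gamma_i"] + p["i_q"] + p["mu_i"] + p["d"]) * I
    dIq = p["i_q"] * I - (p["gamma_q"] + p["mu_q"] + p["d"]) * Iq
    dR = p["gamma_i"] * I + p["gamma_q"] * Iq - p["d"] * R
    dD = p["mu_i"] * I + p["mu_q"] * Iq   # cumulative disease deaths
    dC = p["i_q"] * I                     # cumulative reported cases
    return [dS, dSq, dE, dI, dIq, dR, dD, dC]


def rk4_step(f, y, h, p):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = f(y, p)
    k2 = f([yi + 0.5 * h * k for yi, k in zip(y, k1)], p)
    k3 = f([yi + 0.5 * h * k for yi, k in zip(y, k2)], p)
    k4 = f([yi + h * k for yi, k in zip(y, k3)], p)
    return [yi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def simulate(p, y0, days, h=0.1):
    """Integrate the system for the given number of days and return the final state."""
    y = list(y0)
    for _ in range(int(round(days / h))):
        y = rk4_step(seiqr_rhs, y, h, p)
    return y


# alpha, r and s_q follow the text; all other values are hypothetical placeholders.
HYPOTHETICAL = {"beta": 1.0, "alpha": 1.0 / 4.2, "gamma_i": 0.1, "i_q": 0.3,
                "mu_i": 0.001, "gamma_q": 0.05, "mu_q": 0.003, "r": 0.989,
                "s_q": 0.096, "q_s": 0.01, "pi": 0.0, "d": 0.0}
# S, Sq, E, I, Iq, R, cumulative deaths, cumulative reported cases
Y0 = [59_000_000.0, 0.0, 142.0, 69.0, 258.0, 0.0, 0.0, 258.0]

for factor in (0.9, 1.0, 1.1):
    params = dict(HYPOTHETICAL, s_q=0.096 * factor)
    final = simulate(params, Y0, days=80)
    print(f"s_q={params['s_q']:.4f} reported={final[7]:.0f} deaths={final[6]:.0f}")
```

a fixed-step runge-kutta integrator is used only to keep the sketch dependency-free; the study itself reports using matlab's ode45.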
we used the parameters in table 1 for 15-day forecasting. figure 4 shows the estimated number of infected cases for 80 days. the plot on the left depicts the estimated number of exposed and the right plot depicts the estimated number of reported (infected) cases i q . as can be seen, the number of individuals in each of these classes tends to zero, which implies that the outbreak is almost over, and so new cases may not be recorded in hubei. as can be seen from the change in the infected class, the outbreak reaches its peak about february 9, 2020. the infected class i also shows how many people were out with no symptoms or mild symptoms during the outbreak. figure 4: the plot on the left depicts the number of exposed cases and the plot on the right depicts the number of infected cases with initial conditions s(0) = 59,000,000, s q (0) = 0, e(0) = 142, i(0) = 69, i q (0) = 258, r(0) = 0 for 80 days. in figure 5 , the plot on the left shows the estimated number of cumulative reported (infected) cases and the right plot shows the estimated number of cumulative deaths. as of 30 march 2020, there were no reported cases in hubei in the past week and the total number of cases and total number of deaths were 67801 and 3187, respectively. our model (1) predicts the number of cases and deaths with high accuracy, with 6 percent relative error. we estimated the fatality rate of the outbreak in hubei as approximately 4.8%, with the estimated number of cases about 67994 and deaths about 3254. several parameters play important roles in the model (1). these parameters were estimated with existing data as of (coronavirus covid-19 global cases by johns hopkins csse, 2020).
in order to determine the set of parameters that are statistically significant regarding the number of cumulative infected cases, we conduct a sensitivity analysis of the model. we utilized latin hypercube sampling (lhs) and the partial rank correlation coefficients (prcc) method (marino, hogue, ray, & kirschner, 2008). we use the ranges given in table 2 to sample parameters from a uniform distribution, then use these samples as input variables when we run the system (1) with initial conditions s(0) = 1000, s q (0) = 0, e(0) = 10, i(0) = 3, i q (0) = 0, r(0) = 0 for 90 days. the number of cumulative infected cases is the output variable in the sensitivity analysis. table 2 shows the prcc values, p-values and the range for each corresponding parameter. the sensitivity analysis indicates that β, s q , i q , and r are the statistically more significant parameters in the dynamics of the outbreak, judging by their high prcc values. therefore, it is of interest to study how the number of cumulative infected cases changes when s q , i q , r, and β are varied while the other parameters are held the same as in table 1 and the initial conditions are the same as before. figure 6 shows the results of these experiments: how the number of cumulative cases changes for different values of s q , i q , r, and β. it is also important to analyze how the value of r 0 varies with β and i q . thus, we vary β in the range [1, 10] and i q in the range [0.1, 4] while keeping all other parameters the same as in table 1 . figure 7 shows the boxplot of β and i q . we observe that i q affects r 0 over a wider range compared to β. thus, r 0 will vary roughly between 3 and 8. in addition, the value of r 0 drops below 1 when i q is above 5.1.
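the dependence of r 0 on β and i q discussed above can be illustrated with a short sketch. the closed-form expression for r 0 is not reproduced in this text; the form below is the standard next-generation result for a model in which only the class i transmits, and it is consistent with the stated signs (α and β increase r 0 ; i q , γ i , µ i and d decrease it). it should be read as an assumption, and the numerical values are hypothetical rather than the fitted values of table 1.

```python
def basic_reproduction_number(beta, alpha, i_q, gamma_i, mu_i, d):
    """Assumed closed form: R0 = alpha*beta / ((alpha + d) * (i_q + gamma_i + mu_i + d))."""
    return (alpha * beta) / ((alpha + d) * (i_q + gamma_i + mu_i + d))


def critical_reporting_rate(beta, alpha, gamma_i, mu_i, d):
    """Value of i_q at which the assumed R0 equals 1; larger i_q gives R0 < 1."""
    return alpha * beta / (alpha + d) - (gamma_i + mu_i + d)


# Hypothetical parameter values chosen only for illustration.
params = dict(beta=5.5, alpha=1.0 / 4.2, gamma_i=0.3, mu_i=0.01, d=0.0001)

threshold = critical_reporting_rate(**params)
print("critical i_q:", threshold)
for i_q in (0.1, 1.0, 4.0, threshold):
    print(i_q, basic_reproduction_number(i_q=i_q, **params))
```

under this assumed form, raising the reporting rate i q (for example through more testing) lowers r 0 monotonically, which is the mechanism the text describes.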
the rate of reported (infected) cases, i q , is related to the number of tests given during the outbreak to identify the infected people. thus, increasing the number of tests will increase the rate of case reporting i q . this will reduce the number of cases (see figure 6 ) and, consequently, the number of deaths due to the outbreak. when we increase the rate of reported (infected) cases i q by about 10%, the number of cases and number of deaths are estimated to be 36040 and 1639, respectively. when we decrease the rate by about 10%, the number of cases and number of deaths are estimated to be 14084 and 6724, respectively. in this part, we fit the model (1) with available covid-19 data from turkey (ministry of health (turkey), 2020). we fit the model (1) with turkish data from march 10, 2020 to april 10, 2020, and get about 5.9% relative error in the fit by using equation (3). we estimate the four parameters i q , s q , β, and r, which are not only the most significant parameters in the dynamics of the outbreak, but are also specific to each country, since they are related to the number of covid-19 tests administered (i q ), the number of individuals in quarantine (s q ), the contact rate of individuals (β), and the efficiency of quarantine (r) in each country. therefore, using the initial conditions s(0) = 83,000,000, s q (0) = 0, i q (0) = 1, and r(0) = 0, we estimate these four parameters together with the initial numbers of exposed and infected individuals. we do not estimate the rest of the parameters, employing the parameters in table 1 .
therefore, our results in this section will depend on the observed dynamics of the outbreak in hubei as well as the available turkish data (ministry of health (turkey), 2020). note that the quarantine rate and the rate of reported cases (which, we stress, is related to the number of covid-19 tests) can be increased, and the increase may still have a significant effect toward the reduction of the number of cases (see figure 6 , sensitivity analysis), but increasing the reduction rate r does not make very significant changes in the number of cases in turkey since it is very close to its maximum level (see figure 6 , sensitivity analysis part). thus, we will vary only the quarantine rate s q and the rate of reported cases i q in forecasting the peak of the outbreak and the number of cases/deaths in turkey. the rate of reported cases is about 1.8; this rate is larger than what we observed in hubei. this implies that, in terms of the number of covid-19 tests conducted per day, turkey is now doing a better job than hubei, china at a comparable time in hubei's outbreak. the efficiency of quarantine also seems to be very good in turkey, given the approximately 85% reduction in the contact rate of covid-19 obtained by our parameter estimation. on the other hand, the quarantine rate is about 0.088, which is small when compared with the quarantine rate in hubei (the rate was 0.096 in hubei). in hubei, the population transitioned to the quarantine class very quickly (in almost two weeks), but in turkey the movement to quarantine has been very slow in comparison (see figure 8 ), which suggests why the contact rate is higher in turkey than in hubei (see tables 1 and 3 for these rates).
it is still possible to increase the quarantine rate (the rate, per day, of transition to the quarantine class) and the number of covid-19 tests given each day in turkey to reduce the number of cases and deaths (see figures 9 and 10 ). in figures 9 and 10 , the red curves are obtained using the base parameters from tables 1 and 3, and the other curves are obtained by varying the quarantine rate s q and the rate of reported cases i q . when we use the base parameter values obtained from our fitting, turkey will have about 203,700 cases and 8,269 deaths. if turkey can increase the number of individuals in quarantine and the number of daily covid-19 tests, then, depending on the magnitude of the increases, the number of cases and deaths can decrease significantly (see figures 9 and 10 ). when we look at the trajectories of cumulative cases and deaths in figures 9 and 10 , in the worst-case scenario of the study (the black curves), turkey will have about 281,500 cases and 11,430 deaths. these projections decrease to 148,100 cases and 6,005 deaths if turkey can increase the number of individuals in quarantine and the number of covid-19 tests. figure 9: cumulative number of (infected) cases depending on different quarantine rate s q and rate of reported cases i q . the left graph shows the cumulative number of cases between day 1 and day 40, and the right plot shows the cumulative number of cases between day 1 and day 150 in the outbreak in turkey. figure 10: cumulative number of deaths depending on different quarantine rate s q and rate of reported cases i q .
the left graph shows the cumulative number of deaths between day 1 and day 40, and the right plot shows the cumulative number of deaths between day 1 and day 150 in the outbreak in turkey. the peak of the outbreak in turkey is also very sensitive to the quarantine rate s q and the rate of reported cases i q . depending on the change in the quarantine rate and the rate of reported cases i q , the peak of the outbreak in turkey can be seen between day 42 (april 20, 2020) and day 48 (april 26, 2020), and the outbreak will almost die out by day 150 (at the end of july 2020, see figure 11 ). figure 11: projected (simulated) peak of the outbreak in turkey depending on different quarantine rate s q and rate of reported cases i q . our analysis suggests that quarantine greatly reduced the number of cases and deaths seen in hubei's covid-19 outbreak. in addition, while quarantine does not appear in the representation of r 0 , it still indirectly reduces r 0 . we also saw that the dynamics of the outbreak are very sensitive to the quarantine rate s q and contact rate β, as indicated by our sensitivity analysis. the basic reproductive number is estimated as 5.49 and the study shows that any change in β or i q directly affects the basic reproductive number. the quarantine decidedly reduces the number of cases and deaths. decreasing (or increasing) the speed of movement from the susceptible class to the quarantine class by about 10% would roughly double (or halve) the number of cases and deaths due to the outbreak (this speed of movement is controlled by the rate s q ). of course, the efficiency of the quarantine is also very important.
in our model, the efficiency of the quarantine is measured by the reduction rate r. the reduction rate shows how much the contact rate of covid-19 is reduced thanks to the quarantine. based on our sensitivity analysis, this parameter is very important (see figure 6 ). our model shows that the quarantine in hubei was almost perfect since it caused about a 98.9 percent reduction in the contact rate of covid-19. another important parameter that plays a crucial role in the dynamics of the outbreak is the rate of reported (infected) cases i q , which is directly related to the number of tests given to detect infected individuals. similar to the quarantine rate s q , a 10% reduction (or increase) in the rate of reported (infected) cases i q could roughly double (or halve) the number of cases. as of 30 march 2020, there were no reported cases in hubei in the past week and the total number of cases and total number of deaths were 67801 and 3187, respectively. based on our 15-day forecasting, the number of cases in hubei was projected to be about 67994 and the number of deaths was projected to be about 3254. thus, we estimate the fatality rate of the outbreak to be about 4.8% in hubei. our model gives about 6% relative error and we are confident that using the model will be helpful for forecasting local outbreaks of the pandemic in other regions. from existing covid-19 data from turkey and the dynamics of our model understood from the hubei analysis, the outbreak in turkey is expected to reach its peak between april 20 and april 26, depending on the number of individuals in quarantine and the number of covid-19 tests carried out each day in turkey. the daily number of tests given in turkey is large when compared to the rate of reported cases in hubei. as we showed in the sensitivity analysis, increasing the number of covid-19 tests and the number of individuals in quarantine will significantly reduce the number of cases (and deaths).
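the hubei fatality figure quoted above follows directly from the projected totals, and the same arithmetic shows how close the projections are to the reported counts. the short sketch below uses only the numbers given in the text; the helper names are hypothetical.

```python
def case_fatality(deaths, cases):
    """Case fatality as a fraction: cumulative deaths divided by cumulative cases."""
    return deaths / cases


def relative_deviation(projected, reported):
    """Signed deviation of a projection from the reported value, as a fraction."""
    return (projected - reported) / reported


reported_cases, reported_deaths = 67801, 3187      # hubei totals as of 30 march 2020
projected_cases, projected_deaths = 67994, 3254    # 15-day forecast from the model

print(f"reported fatality:  {100 * case_fatality(reported_deaths, reported_cases):.1f}%")
print(f"projected fatality: {100 * case_fatality(projected_deaths, projected_cases):.1f}%")
print(f"case deviation:  {100 * relative_deviation(projected_cases, reported_cases):.2f}%")
print(f"death deviation: {100 * relative_deviation(projected_deaths, reported_deaths):.2f}%")
```

the projected totals give a fatality of about 4.8%, while the reported totals give about 4.7%.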
based on our forecasting, the number of cases will be about 203,700, with a range of 148,100 to 281,500, and the number of deaths will be about 8,269, with a range of 6,005 to 11,430, depending on the quarantine rate s q and the rate of reported cases i q in turkey. thus, in every scenario shown in figures 9 and 10, the fatality rate of covid-19 will be about 4.1% in turkey. small changes in the quarantine rate make significant changes in the total number of cases and deaths in turkey. the efficiency of the quarantine in turkey is about 85 percent, meaning that it causes an 85 percent reduction in the contact rate of covid-19. thus, the quarantine rate s q and its efficiency are very important for containing the outbreak (see table 3 for the reduction rate r and figure 6 for the effect of the reduction rate on the total number of cases). as of april 10, 2020, the number of covid-19 tests given each day in turkey had increased to 30,000 (ministry of health (turkey), 2020). if the number is increased further, it will also decrease the total number of cases and deaths in turkey (see figures 9 and 10). estimating the effective reproduction number of the 2019-ncov in china a familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster. the lancet. chinese physicians. (2020). ncov.dxy.cn mathematical modeling and the epidemiological research process coronavirus covid-19 global cases by johns hopkins csse. (2020). as the construction of next-generation matrices for compartmental epidemic models others (2020).
clinical characteristics of 2019 novel coronavirus infection in china clinical features of patients infected with 2019 novel coronavirus in wuhan, china estimating the potential total number of novel coronavirus cases in wuhan city modeling infectious diseases in humans and animals elements of mathematical ecology. cambridge a methodology for performing global uncertainty and sensitivity analysis in systems biology coronavirus disease 2019 (covid-19) daily data serial interval of novel coronavirus (2019-ncov) infections. medrxiv novel coronavirus 2019-ncov: early estimation of epidemiological parameters and epidemic predictions pattern of early human-to-human transmission of wuhan 2019-ncov the novel coronavirus, 2019-ncov, is highly contagious and more infectious than initially estimated modelling the epidemic trend of the 2019 novel coronavirus outbreak in china reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission further notes on the basic reproduction number crude birth and death rate data by country novel coronavirus (covid-2019) situation reports evolution of the novel coronavirus from the ongoing wuhan outbreak and modeling of its spike protein for risk of human transmission preliminary estimation of the basic reproduction number of novel coronavirus (2019-ncov) in china, from 2019 to 2020: a data-driven analysis in the early phase of the outbreak the authors would like to acknowledge the generous support of the turkish ministry of national education in the study. the authors declared no competing interests.authors' contributions all authors contributed equally to this work. 
key: cord-292537-9ra4r6v6 authors: liu, fenglin; wang, jie; liu, jiawen; li, yue; liu, dagong; tong, junliang; li, zhuoqun; yu, dan; fan, yifan; bi, xiaohui; zhang, xueting; mo, steven title: predicting and analyzing the covid-19 epidemic in china: based on seird, lstm and gwr models date: 2020-08-27 journal: plos one doi: 10.1371/journal.pone.0238280 sha: doc_id: 292537 cord_uid: 9ra4r6v6 in december 2019, the novel coronavirus pneumonia (covid-19) emerged in wuhan, hubei province, china. the epidemic quickly broke out and spread throughout the country. it has now become a pandemic that affects the whole world. in this study, three models were used to fit and predict the epidemic situation in china: a modified seird (susceptible-exposed-infected-recovered-dead) dynamic model, a neural network method, lstm (long short-term memory), and a gwr (geographically weighted regression) model reflecting spatial heterogeneity. overall, all three models performed well, with high accuracy. the ape (absolute percent error) of the dynamic seird prediction for china had been ≤ 1.0% since mid-february. the lstm model showed comparable accuracy. the gwr model took into account the influence of geographical differences, with r² = 99.98% in fitting and 97.95% in prediction. a wilcoxon test showed that none of the three models outperformed the other two at the significance level of 0.05. the parametric analysis of the infectious rate and recovery rate demonstrated that china's national policies had effectively slowed down the spread of the epidemic. furthermore, the models in this study provide a wide range of implications for other countries seeking to predict the short-term and long-term trend of covid-19 and to evaluate the intensity and effect of their interventions. novel coronavirus pneumonia (coronavirus disease 2019, covid-19) first broke out in wuhan, hubei province, china in december 2019, and the epidemic then spread to the rest of the world.
in the research on covid-19 so far, some studies have compared the gene sequence of the virus with those of mammalian coronaviruses and found that its source may be related to bats, snakes, minks, malayan pangolins, turtles and other wild animals [1-4]. covid-19 can also cause severe respiratory illness with symptoms such as fever and cough [5], and there is a possibility of transmission after symptoms of lower respiratory diseases appear [6]. however, unlike sars-cov and mers-cov, covid-19 was isolated from airway epithelial cells of patients [6], and the mechanism of receptor recognition is not consistent with sars [7]. therefore, the pathogenicity of covid-19 is less than that of sars [8], and its transmissibility is higher than that of sars [9]. in addition, this new coronavirus presents human-to-human transmission [10], and close contact can lead to group outbreaks [11]. as of july 7th, 2020, 85,359 confirmed cases and 4,648 deaths had been reported in china [12]. in addition to china, there are over 200 countries and regions in the world with a total of 11,630,898 confirmed cases and 538,512 deaths [12]. the outbreak of covid-19 happened right before the lunar new year, which is the typical chinese spring festival travel period. with a population of over 11 million, wuhan is one of the major transportation hubs in china as well as a core city of the yangtze river economic belt. the time and location of the outbreak further led to the rapid spread of the epidemic in china [13]. since there is still no vaccine or antiviral drug specifically for covid-19, the government's policies or actions play an important role in flattening the epidemic curve [14]. from the perspective of public health, the interventions of the wuhan government achieved the purpose of reducing the flow of people and the risk of exposure to diagnosed patients, and also effectively slowed down the spread of the epidemic [15].
nevertheless, covid-19 can be transmitted by asymptomatic carriers [16], and some of the recovered patients may still be virus carriers [17]. in order to implement non-pharmaceutical interventions more effectively, we combined epidemiological methods with mathematical and statistical modeling tools to provide insights and predictions as benchmarks. for the study of infectious diseases like covid-19, sars, and ebola, most of the literature used descriptive research or model methods to assess indicators and analyze the effect of interventions, such as combining migration data to evaluate the potential infection rate [18, 19], understanding the impact of factors like environmental temperature and vaccines that might be linked to the diseases [20, 21], using the basic and time-varying reproduction numbers (r_0 and r_t) to estimate the changing transmission dynamics of epidemic conditions [22-27], calculating and predicting the fatality risk at any stage of an outbreak [28-30], or providing suggestions and interventions from risk management and other related aspects based on the results of modeling tools or historical lessons [31-39]. some literature used only one kind of model to simulate and predict the course of diseases. for instance, some studies used relatively common epidemiological dynamics models like seir or sird to forecast epidemic trends and peaks in certain provinces or even the world [9, 40-44]; others applied statistical models such as logistic growth models or time series approaches to analyze the epidemic situation [45, 46], or developed new models to support more complex trajectories of epidemics or to predict the number of confirmed cases and the spatial progression of outbreaks [47-49]. several studies further expanded the basic epidemic dynamic models.
examples include joining a border protection mechanism with the seir model to better identify high-risk groups and infected cases [50]; adding the effect of media or awareness to basic models to assess whether these outside influences would possibly change the transmission mode of infectious diseases [51, 52]; and, according to the transmission routes contained in dynamic models, using a multiplex network model or transmission network topology to analyze the outbreak scale and epidemic spread more accurately [53, 54]. a small number of studies combined the analysis capabilities of two types of models, like the seir model and the recurrent neural network (rnn) model, to determine whether certain interventions could affect the results of outbreak control [55]. however, our literature search did not find any covid-19 study that used geographically weighted regression (gwr). there is also a lack of understanding of how well different algorithms predict the epidemic curve relative to one another. in this study, seird, an extension of the seir model, was used to simulate the epidemic situation in china and to predict the number of confirmed and cured cases in each province and in several major chinese cities. an lstm model combined with traffic data and a gwr model were used to predict the number of confirmed patients. specifically, the gwr model, which reflects geographical differences, was used to predict the development of the epidemic situation and analyze the impact of geographical factors. this paper also compares the characteristics and prediction ability of these models. in the absence of vaccines and drugs for covid-19, it makes sense to use multiple models to show the situation and the intensity of non-pharmaceutical interventions needed, and so to simulate and guide the control of outbreaks.
daily updated covid-19 epidemiological data used in this study were retrieved from the national health commission of china [12] and accessed via https://github.com/wybert/openwuhan-ncov-illness-data. the daily outbound volume from wuhan city and the relevant migration indices from january to march were collected from an online platform called baidu qianxi [56]. the demographic data and medical resources data were from the china urban statistical yearbook published by the national bureau of statistics, as shown in s1 table. this study used the seird model; the changes in the status of the susceptible (s), exposed (e), infected (i), recovered (r) and dead (d) populations within the total population (n) are shown in fig 1. according to the medical characteristics and clinical trials of covid-19, both confirmed patients and asymptomatic carriers have the ability to transmit the virus. therefore, susceptible people have a certain chance of becoming infected after they come into contact with exposed or infected individuals [43]. carriers in the exposed status may develop obvious symptoms after the incubation period and be diagnosed, or they may recover. the final status of individuals can basically be divided into two categories: one is recovery through the combined effects of hospital treatment and the patient's own immunity, and the other is death without effective treatment. in the model formula, the infectious rate β needs to be adjusted in real time to adapt to the trend of disease development. in the middle and late stages of the epidemic, the number of daily new cases decreased significantly due to the positive influence of government policies. thus, to better fit the model, we added an attenuation factor desc to β. based on the basic seird model formulas [57, 58], our modified model is shown as eqs (1)-(6).
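the compartment flows described above can be sketched numerically. the following is a minimal illustration, not the authors' eqs (1)-(6): it assumes that both exposed and infected individuals transmit with the same rate β, that β is attenuated as β·desc^t, and that a simple forward-euler step is accurate enough. the function name `simulate_seird`, its arguments and the step size `dt` are hypothetical.

```python
def simulate_seird(n, e0, i0, days, beta0, alpha, gamma1, gamma2, k, desc, dt=0.1):
    """Forward-Euler sketch of an SEIRD model with a decaying infectious rate.

    Assumed form (not taken from the paper's equations):
      beta(t) = beta0 * desc**t
      dS/dt = -beta(t) * S * (E + I) / N
      dE/dt =  beta(t) * S * (E + I) / N - alpha * E - gamma1 * E
      dI/dt =  alpha * E - gamma2 * I - k * I
      dR/dt =  gamma1 * E + gamma2 * I
      dD/dt =  k * I
    Returns one (day, S, E, I, R, D) tuple per day, starting at day 0.
    """
    s, e, i, r, d = float(n - e0 - i0), float(e0), float(i0), 0.0, 0.0
    history = [(0, s, e, i, r, d)]
    steps_per_day = round(1 / dt)
    for step in range(days * steps_per_day):
        t = step * dt
        beta = beta0 * desc ** t              # assumed attenuation of beta
        new_exposed = beta * s * (e + i) / n  # contact with exposed or infected
        ds = -new_exposed
        de = new_exposed - (alpha + gamma1) * e
        di = alpha * e - (gamma2 + k) * i
        dr = gamma1 * e + gamma2 * i
        dd = k * i
        s, e, i, r, d = (s + ds * dt, e + de * dt, i + di * dt,
                         r + dr * dt, d + dd * dt)
        if (step + 1) % steps_per_day == 0:
            history.append(((step + 1) // steps_per_day, s, e, i, r, d))
    return history
```

because the five derivatives sum to zero, the total population n is conserved, which gives a quick sanity check on any chosen parameter set.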
here, the parameter t denotes the time; β is the infectious rate; α is the rate at which the exposed become infected; γ_1 is the recovery rate for the exposed; γ_2 is the recovery rate for the infected; k is the mortality rate; "desc" is the attenuation factor for β, so that β decays exponentially when 0