key: cord-354073-tn76muv6
authors: Jen, Tung-Hui; Chien, Tsair-Wei; Yeh, Yu-Tsen; Lin, Jui-Chung John; Kuo, Shu-Chun; Chou, Willy
title: Geographic risk assessment of COVID-19 transmission using recent data: An observational study
date: 2020-06-12
journal: Medicine (Baltimore)
DOI: 10.1097/md.0000000000020774
sha: 
doc_id: 354073
cord_uid: tn76muv6

BACKGROUND: The US Centers for Disease Control and Prevention (CDC) regularly issues “travel health notices” that address outbreaks of novel coronavirus disease 2019 (COVID-19) in destinations worldwide. The notices are classified into 3 levels based on the risk posed by the outbreak and what precautions should be in place to prevent spreading. What objectively observed criteria of these COVID-19 situations are required for classification and visualization? This study aimed to visualize the epidemic outbreak and the provisional case fatality rate (CFR) using the Rasch model and Bayes's theorem and developed an algorithm that classifies countries/regions into categories that are then shown on Google Maps. METHODS: We downloaded daily COVID-19 outbreak numbers for countries/regions from the GitHub website, which contains information on confirmed cases in more than 30 Chinese locations and other countries/regions. The Rasch model was used to estimate the epidemic outbreak for each country/region using data from recent days. All responses were transformed by using the logarithm function. The CFRs based on Bayes's theorem were computed for each region. The geographic risk of transmission of the COVID-19 epidemic was thus determined using both magnitudes (i.e., Rasch scores and CFRs) for each country. RESULTS: The top 7 countries were Iran, South Korea, Italy, Germany, Spain, China (Hubei), and France, with values of {4.53, 3.47, 3.18, 1.65, 1.34, 1.13, 1.06} and {13.69%, 0.91%, 47.71%, 0.23%, 24.44%, 3.56%, and 16.22%} for the outbreak magnitudes and CFRs, respectively. The results were consistent with the US CDC travel advisories of warning level 3 in China, Iran, and most European countries and of level 2 in South Korea on March 16, 2020. CONCLUSION: We created an online algorithm that used the CFRs to display the geographic risks to understand COVID-19 transmission.
The app was developed to display which countries had higher travel risks and aid with the understanding of the outbreak situation.

Since the outbreak of the 2019 novel coronavirus disease in Wuhan city, China, on January 30, 2020, [1, 2] a total of 182,185 confirmed cases and 7148 deaths had been reported by March 16, 2020, [3] involving 31 provinces/cities in China as well as 162 countries/regions outside of China. [4] The total number of deaths (=7148) has substantially surpassed those from severe acute respiratory syndrome (final toll of 774 deaths in 2003) and the Middle East respiratory syndrome (final toll of 858 deaths in 2012). [5] [6] [7]

1.1. Travel information required for knowledge of COVID-19 risk

In an influenza pandemic, the strength of the increase in confirmed cases is a proxy for epidemic size and disease transmissibility. [8] The US Centers for Disease Control and Prevention (CDC) has established geographic risk-stratification criteria for the purpose of issuing travel health notices for countries with COVID-19 risk and guiding management decisions for people with potential travel-related exposure to COVID-19. [9] Four strata have been established:

(1) limited community transmission, (2) sustained (ongoing) community transmission, (3) widespread, sustained (ongoing) transmission, and (4) widespread, sustained (ongoing) transmission and restrictions on entry to the United States. For instance, on March 16, 2020, the entry of foreign nationals from China and Iran was suspended.

The CDC recommended that (1) travelers avoid all nonessential travel to the following destinations (China, Iran, and most European countries), and (2) older adults or those with chronic medical conditions consider postponing traveling to South Korea.

These represent the 3 levels of notice based on the risk presented by the outbreak and the precautions that are needed to prevent infection, including watch level 1, alert level 2, and warning level 3.

Although a number of factors were involved in publishing the geographic risk stratification, including size (e.g., the number of confirmed cases), geographic distribution, and epidemiology of the outbreak, [8] none of these objectively observed criteria were provided to us for our assessment of the COVID-19 situation for each country/region.

As of February 29, 2020, more than 377 articles related to COVID-19 were searchable with the keyword "covid-19 or 2019-nCoV" on PubMed Central (PMC). [10] The Johns Hopkins Center for Systems Science and Engineering (JHC) has built an online dashboard and regularly updates the data to track the worldwide spread of the 2019-nCoV outbreak [3] with the hope of providing the public with a better understanding of the COVID-19 outbreak. However, the JHC [3] and other dashboards [4, 11, 12] only provided visual dashboards of the world map, with bubbles for countries/regions and little further information on the outbreak. No solid geographic risk assessment for COVID-19 transmission has yet been seen on the internet, including on those websites [3, 4, 13-18] that provide simple and widely available information (e.g., the numbers of confirmed cases, deaths, and recoveries by country/region along with the death rate, transmission rate, incubation period, and discussions of age and demographics) to the public. None were found to offer travel information that would fulfill the public's needs.

Rasch models, [19] named after Georg Rasch, are a family of psychometric models for creating measurements from categorical data, such as answers to questions on a reading assessment or questionnaire responses, as a function of the trade-off between (a) respondent ability and (b) task difficulty. [20] In addition to psychometrics and educational research, the Rasch model and its extensions have been used in other areas, including the health profession [21] and market research, [22] because of their general applicability. [23] Our goal was to determine whether Rasch analysis could be used for inspecting epidemic magnitudes by observing the pattern of daily confirmed cases. The reasons for using the Rasch model were that (1) all responses were ordinal within a specific range (e.g., from 0 to 5, as on a Likert-type survey scale); (2) all regions and days (like persons and items on a test) were placed on an equal-interval continuum with a unit of logit (=log odds) for comparison; [21, 24] and (3) sequential assessments could estimate the epidemic magnitudes and examine the COVID-19 situation for each country/region, rather than relying on the cumulative confirmed cases as in the traditional method, which ignores that recent cases carry greater weight (i.e., importance) in determining the outbreak magnitudes.
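
As a rough illustration of the trade-off described above, the following sketch computes category probabilities under a rating scale formulation of the Rasch model. Here `theta` plays the role of a region's outbreak magnitude and `delta` the difficulty of a day, both in logits. The function names and the threshold values in `taus` are assumptions made for illustration; this is not the authors' online code.

```python
import math

def rating_scale_probs(theta, delta, taus):
    """Probabilities of categories 0..m under a Rasch rating scale model.

    theta: measure of the region (logits)
    delta: difficulty of the day (logits)
    taus:  step thresholds tau_1..tau_m (illustrative values, an assumption)
    """
    # Cumulative sums of (theta - delta - tau_j); the sum for category 0 is empty (0).
    cumulative = [0.0]
    for tau in taus:
        cumulative.append(cumulative[-1] + (theta - delta - tau))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]

def expected_score(theta, delta, taus):
    """Model-expected ordinal score for a region on a given day."""
    probs = rating_scale_probs(theta, delta, taus)
    return sum(k * p for k, p in enumerate(probs))

# Hypothetical thresholds for a 0-3 scale.
example_taus = [-1.0, 0.0, 1.0]
probs = rating_scale_probs(theta=1.0, delta=0.0, taus=example_taus)
```

A region with a larger `theta` has a higher expected score on every day, which is what allows regions to be ordered on a single logit scale.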

The case fatality rate (CFR) is related to the following questions:

(1) How deadly is this? and (2) How many people will die in this outbreak? Severe acute respiratory syndrome, the Middle East respiratory syndrome, Ebola, and H1N1 yielded real CFRs of 9.6%, 34.4%, 73%, and 0.4%, respectively, [5] [6] [7] and the CFR for COVID-19 has been discussed in numerous articles. [25] [26] [27] The World Health Organization, in a press conference on January 29, 2020, announced that the death rate of COVID-19 was 2% based on the CFR calculation (= deaths/cases). [4, 28-31] This figure was substantially underestimated because it assumed (1) no lag days from symptom onset to death (i.e., the deaths registered today arose from cases confirmed many days earlier) [27] and (2) that all currently infected cases would fully (i.e., 100%) recover.
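
The arithmetic behind this point can be sketched as follows. The first function is the crude calculation (deaths/cases) mentioned above, applied to the worldwide totals quoted in the introduction. The second function is not from this study; it is a commonly used resolved-case estimator, shown only to illustrate how strongly the assumption about currently infected cases drives the result. The regional numbers are invented for illustration.

```python
def naive_cfr(deaths, confirmed):
    """Crude CFR: deaths divided by all confirmed cases."""
    return deaths / confirmed

def closed_case_cfr(deaths, recovered):
    """CFR among resolved cases only (deaths plus recoveries)."""
    return deaths / (deaths + recovered)

# Worldwide totals quoted in the introduction (March 16, 2020).
global_naive = naive_cfr(deaths=7148, confirmed=182185)

# Hypothetical region; these numbers are made up for illustration only.
region_naive = naive_cfr(deaths=2, confirmed=100)
region_closed = closed_case_cfr(deaths=2, recovered=18)
```

In the hypothetical region, the crude rate treats the 80 unresolved cases as if none of them will die, so it sits well below the rate among the resolved cases.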

Bayes's theorem (alternatively Bayes's law or Bayes's rule) describes the probability of an event based on prior knowledge of conditions that might be related to the event. [32] It is necessary to use the post-CFR to adjust the prior-CFR for each country/region on COVID-19 to examine the geographic risks. This is because the post-CFR might be increased if the conditional probability of death is greater than its counterpart for recoveries according to the equation

$$P(A_1 \mid B) = \frac{P(B \mid A_1)\,P(A_1)}{P(B)},$$

where the probability P(A1) (the CFR) is adjusted based on (1) the shared portions of conditional deaths and recoveries, P(B|A1) and P(B|A2), and (2) the total probability for a particular region,

$$P(B) = P(B \mid A_1)\,P(A_1) + P(B \mid A_2)\,P(A_2),$$

with P(A2) = 1 - CFR. The shared portions can be used to more accurately assess the probability P(A1) (the CFR), which can be done without the knowledge of the shares using Bayes's theorem for estimation.

In the current study, we were motivated to apply Bayes's theorem to estimate the adjusted CFR for countries/regions on COVID-19.

The aims of the current study were to (1) visualize (i) the outbreak magnitude and (ii) the adjusted CFRs for countries/regions in recent days; (2) develop an algorithm that classifies countries/regions into categories of outbreak epidemics and shows them on Google Maps; and (3) design an app for better interpreting the geographic risk of COVID-19 transmission.

We downloaded COVID-19 outbreak numbers on March 16, 2020, from GitHub, [13] a site that provides information on newly confirmed cases in more than 31 Chinese locations and other countries/regions. All downloaded data (in Supplemental Digital Content file 1, http://links.lww.com/MD/E415) were publicly displayed on the website. Ethical approval was not necessary for this study because all the data were obtained via the internet. [13]

2.2. Rasch model for obtaining the outbreak magnitudes

The Rasch analysis [33] was performed online using author-developed codes. [34] All responses were ordinal scores derived from the numbers of confirmed cases using the logarithm function (i.e., the Excel function ROUND(LN(confirmed cases), 0), yielding scores from 0 to 5) for each region in China and other countries. The geographic risks for COVID-19 transmission were determined by both the outbreak magnitudes with a unit of logit (log odds) and the adjusted CFRs based on Bayes's theorem.
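
The log transformation described above can be sketched in a few lines. This is a minimal sketch, not the authors' spreadsheet: the function name is hypothetical, and both the handling of zero counts and the clipping at the top category are assumptions made so that every count maps into the stated 0 to 5 range.

```python
import math

def ordinal_score(confirmed_cases, max_score=5):
    """Map a case count to an ordinal category via round(ln(count)).

    Assumptions (not stated in the text): counts below 1 fall into
    category 0, and scores above max_score are clipped to max_score.
    """
    if confirmed_cases < 1:
        return 0
    # floor(x + 0.5) rounds halves upward, mirroring Excel's ROUND for x >= 0.
    score = math.floor(math.log(confirmed_cases) + 0.5)
    return max(0, min(max_score, score))

# Example daily counts for one hypothetical region.
daily_counts = [0, 1, 3, 20, 55, 100000]
scores = [ordinal_score(c) for c in daily_counts]
```

Because the scale is logarithmic, each additional category corresponds to roughly a 2.7-fold increase in cases, so large counts are compressed into the upper categories.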

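We defined the adjusted post-CFR as shown in Eqs. (1) and (2), with the conditional probabilities given by Eqs. (3) and (4):

$$P(A_1 \mid B) = \frac{P(B \mid A_1)\,P(A_1)}{P(B)}, \qquad (1)$$

$$P(B) = P(B \mid A_1)\,P(A_1) + P(B \mid A_2)\,P(A_2), \qquad (2)$$

$$P(B \mid A_1) = \frac{\text{Deaths in the region}}{\text{Total deaths in all regions}}, \qquad (3)$$

$$P(B \mid A_2) = \frac{\text{Recoveries in the region}}{\text{Total recoveries in all regions}}, \qquad (4)$$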

where P(A1|B) denotes the post-CFR, P(B) stands for the burnouts (or the load of dealing with the currently infected cases in the respective region) on COVID-19, and P(B|Ai) represents the conditional probabilities observed from the structure (or pattern) in deaths (=A1) and recoveries (=A2). P(A1) and P(A2) are the prior-CFR (=deaths/confirmed cases) and the probability of recovery (=1 - CFR), respectively. From Eqs. (3) and (4), the adjusted post-CFR is higher than the prior-CFR if P(B|A1) is greater than P(B|A2); otherwise, the post-CFR is not higher than the prior-CFR. As such, the transmission risk can be denoted by the adjusted post-CFR because these two metrics in Eqs. (3) and (4) are unequal. At the end of the outbreak course, both P(B|A1) and P(B|A2) would converge to identical values, making the post- and prior-CFRs equal.
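
A minimal sketch of this adjustment is given below, assuming the definitions of the prior-CFR and of the conditional shares in Eqs. (3) and (4). The function name, the handling of empty inputs, and the example numbers are assumptions made for illustration and are not taken from the study data.

```python
def post_cfr(deaths, recoveries, confirmed, total_deaths, total_recoveries):
    """Adjusted (posterior) CFR for one region via Bayes's theorem."""
    if confirmed == 0:
        return 0.0  # assumption: no confirmed cases means no measurable CFR
    prior = deaths / confirmed  # P(A1), the prior-CFR
    # Shares of this region in all deaths and all recoveries, Eqs. (3) and (4).
    p_b_a1 = deaths / total_deaths if total_deaths else 0.0
    p_b_a2 = recoveries / total_recoveries if total_recoveries else 0.0
    # Total probability P(B), with P(A2) = 1 - prior.
    p_b = p_b_a1 * prior + p_b_a2 * (1.0 - prior)
    if p_b == 0:
        return prior  # assumption: nothing to update on, keep the prior
    return p_b_a1 * prior / p_b

# Hypothetical region: 10% of all deaths but only 2% of all recoveries.
adjusted = post_cfr(deaths=50, recoveries=100, confirmed=1000,
                    total_deaths=500, total_recoveries=5000)

# Same region with equal shares of deaths and recoveries (10% each).
unchanged = post_cfr(deaths=50, recoveries=500, confirmed=1000,
                     total_deaths=500, total_recoveries=5000)
```

In the first case the region's share of deaths exceeds its share of recoveries, so the post-CFR rises above the prior-CFR of 5%; in the second case the shares are equal and the post-CFR stays at the prior value, matching the convergence argument above.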

World maps have been used to show disparities in health outcomes across areas in many disciplines, [35, 36] such as dengue outbreaks, [37, 38] disease hotspots, [39] and the Global Health Observatory (GHO) maps on major health topics. [40] A Kano diagram [41, 42] was used to highlight the geographic risks of countries/regions. The Kano diagram was used to divide areas into three groups; bubbles were colored by latitude (i.e., above 40 degrees in green and below 23.5 degrees in red) and sized by the doubling days for the confirmed cases of COVID-19 (i.e., the number of days it takes to double the number of confirmed cases, starting from at least 10 cases). The formula (1/d) * 10, where d is the doubling days, was applied to transform the doubling days into a scale on which higher values mean that fewer days are needed to double the number of confirmed cases.
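
One possible reading of the doubling-days definition is sketched below: find the first day with at least 10 cumulative cases, then count the days until the cumulative count is at least twice that value. The function names and the example series are hypothetical, and the authors' exact counting rule may differ.

```python
def doubling_days(cumulative_cases, min_cases=10):
    """Days from the first day with >= min_cases until the count has doubled.

    Returns None if the threshold is never reached or the count never doubles.
    """
    start = next(
        (i for i, count in enumerate(cumulative_cases) if count >= min_cases),
        None,
    )
    if start is None:
        return None
    target = 2 * cumulative_cases[start]
    for day in range(start + 1, len(cumulative_cases)):
        if cumulative_cases[day] >= target:
            return day - start
    return None

def doubling_scale(days):
    """Bubble-size scale (1/d) * 10: fewer doubling days give a larger value."""
    return 10.0 / days

# Hypothetical cumulative series for one region.
series = [2, 5, 10, 12, 15, 21, 30]
d = doubling_days(series)
size = doubling_scale(d)
```

With this scale, a region that doubles in 5 days gets a value of 2.0, whereas one that needs 7 days gets a smaller value, so faster-growing outbreaks appear as larger bubbles.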

Rasch logit scores are on the x-axis and adjusted CFRs on the y-axis. The numbers of confirmed cases in the recent 20 and 10 days were transformed into ordinal scores from 0 to 4, respectively, for comparison. In addition, we plotted countries/regions on the Kano diagram, dividing them among four features represented by different colors:

1. ready to increase (yellow), 2. increasing (green), 3. starting to decrease (light green), and 4. decreasing (red).

A specific algorithm was applied to the categorization of the features mentioned above. Three types of line charts were provided to verify that the 4 features were fully supported.

A dashboard app was designed for a daily updated geographical display of the epidemic situation for travelers. We examined whether the Rasch model could be applied to evaluate the risk-alert level for COVID-19 by comparing the results with the advisories of the US CDC. The study flowchart is shown in Figure 1 and Supplemental Digital Content file 2, http://links.lww.com/MD/E414.

If the last 10 days were applied to measure the geographic risks for regions, the top seven were Germany, Iran, South Korea, Italy, Spain, Sweden, and Norway, with {3.59, 3.59, 2.53, 2.23, 2.23, 1.88, and 1.99} and {0.23%, 13.69%, 0.91%, 47.71%, 24.44%, 0.54%, and 0.23%} for the Rasch scores and CFRs, respectively (see Fig. 3).

Readers are invited to scan the QR codes in Figures 2 and 3 to see details about the information on Google Maps, such as the doubling days for the confirmed cases on COVID-19: 5 and 7 days for Hubei (China) and South Korea.

It is worth noting that Hubei (China) falls behind in outbreak magnitude when the data from the last seven days are used for reporting, because its outbreak situation has gradually improved.

The results were consistent with the US CDC travel advisories of warning level 3 in China, Iran, most European countries, and level 2 in South Korea on March 16, 2020.

The top 3 countries/regions (Italy, Spain, and Iran) with the highest COVID-19 transmission risks were particularly highlighted with symbols from 1 to 3 using the confirmed cases in the recent seven days dated March 16, 2020 (Fig. 5). The bubbles were sized according to the number of confirmed cases and colored by feature (i.e., ready to increase, increasing, starting to decrease, and decreasing). We can see that countries in Europe have green bubbles. In contrast, many regions (or provinces in China) have black bubbles, indicating that there has been no confirmed case in the last 7 days. We suggest that readers scan the QR code in Figure 5 and click the link about the 3-line charts for the region of interest.

The 4 features of the outbreak for each country/region are shown in Figure 5. We can see that the bubbles were sized by the number of confirmed cases and colored by feature (e.g., increasing in green and decreasing in red). The line charts regarding the details appear when the bubble of interest has been clicked.

We confirmed that the information in Figure 2, obtained by using Rasch analysis and the adjusted CFRs, could highlight the travel risk of COVID-19. The results were consistent with the US CDC travel advisories of warning level 3 in China, Iran, and most European countries, and level 2 in South Korea on March 16, 2020.

In an influenza pandemic, the strength of the increase in confirmed cases is a proxy for epidemic size and disease transmissibility. [8] The US CDC has established geographic risk-stratification criteria for the purpose of issuing travel health notices for countries with the risk of COVID-19 transmission and guiding public health management for people with potential travel-related exposures to COVID-19. [9] However, there is no objective measurement system that can help us visualize the transmission risk of COVID-19 for travelers. In this study, we provided visual representations based on the risk posed by the outbreak using Rasch analysis and the CFRs based on Bayes' theorem, which was a rare strategy in the literature.

Many dashboards and websites [3, 4, 13-18] provide daily COVID-19-related information. None of them display such sophisticated messages on the ongoing epidemic situations as those from the Rasch modeling technique and Bayes' theorem (Figs. 2-5). Although choropleth maps have been popularly applied in the healthcare setting, [35, 36] the 2 major features of outbreak magnitudes and CFRs are included in this study to display the high travel risk for COVID-19 transmission, which differentiates this study from others [3, 4, 13-18, 43] that only provide the number of confirmed cases or other simple information, particularly with bubbles sized by the number of confirmed cases and merely colored without other meaningful features.

We provide 2 main algorithms that display the outbreak magnitudes and CFRs to highlight the regions with the highest transmission risk, which are rarely seen in the literature but are of importance in revealing the epidemic transmission risk. Despite their complex computations, these 2 algorithms can be routinely run on the internet, which allows us to easily examine the daily progress of the outbreak, as we have shown in the previous figures. QR codes have been provided so that readers can examine the detailed information on any region of interest on the dashboards via Google Maps.

The post-CFRs were used to examine how the particular risks appeared in regions. In this case, the 7 countries/regions were within our expectations and were listed on the US CDC website on March 16, 2020, [9] indicating that the results were reliable.

The main strengths of the current study include (1) the epidemic trend displayed under the Rasch measurement (x-axes in Figs. 2 and 3); (2) CFRs based on Bayes' theorem, which was enriched in this study (y-axes in Figs. 2 and 3); (3) the geographic risks shown on Google Maps (Fig. 4); (4) the use of 4 features to display all countries/regions in four respective quadrants (Fig. 5); and (5) the creation of an app to demonstrate the COVID-19 situations on dashboards that use Google Maps for display.

Our study has some limitations. First, we were more concerned with the transmission risk in certain regions. As such, the numbers of confirmed cases were transformed into ordinal scores (e.g., from 0 to 5) to fit the Rasch model's requirement. Whether the preliminary assumptions of the Rasch model were met (e.g., local independence of items and a unidimensional scale) was not examined in this study, though Rasch analysis can be performed on such repeated measures. [44] [45] [46] Second, although we applied CFRs to distinguish the geographic risks, the difference between the prior- and post-CFRs might emphasize the regions with higher risks based on death tolls. In contrast, the Rasch logit scores were focused on the outbreak magnitudes. A greater number of confirmed cases yields higher magnitudes due to momentum.

Third, readers might be doubtful about the different weights, which were created by transforming original counts into ordinal scores using the logarithm function, used in the Rasch analysis. Areas with more confirmed cases have lower weights, similar to the law of diminishing marginal utility in economics. [47] Otherwise, the transformation function can be substituted with other functions, such as equal interval compression (e.g., compress cases/1000 into several categories), to meet the requirement of Rasch measurement.

Fourth, the doubling days for the confirmed cases on COVID-19 have not been discussed much in this study. The use of doubling days in estimating the number of confirmed cases in a region is worth studying in the future. For instance, when the doubling days and the average length of hospitalization for deaths (ALHD) are known, the confirmed cases can be estimated by the formula of 2^(ALHD/DD) * death tolls in a region.
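
The estimation formula mentioned above can be written out as a short sketch. The function name is hypothetical and the input values are invented for illustration; they are not estimates for any real region.

```python
def estimate_confirmed_cases(death_toll, alhd_days, doubling_days):
    """Estimate confirmed cases as 2^(ALHD/DD) * death toll.

    alhd_days:     average length of hospitalization for deaths (ALHD)
    doubling_days: days needed for confirmed cases to double (DD)
    """
    if doubling_days <= 0:
        raise ValueError("doubling_days must be positive")
    return death_toll * 2 ** (alhd_days / doubling_days)

# Hypothetical inputs: 100 deaths, ALHD of 14 days, doubling every 7 days.
estimate = estimate_confirmed_cases(death_toll=100, alhd_days=14, doubling_days=7)
```

With these made-up inputs the cases double twice during the average hospitalization period, so the estimate is four times the death toll.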

Furthermore, the online Rasch rating scale model [33, 34] was programmed by the authors. Although many visualization models have been developed, other useful diagrams and algorithms, such as diagnosis maps and KIDMAP, [48, 49] can be further elaborated and developed in the future.

Finally, we suggest using both outbreak magnitudes and CFRs to observe the transmission risk in regions. The former concerns the number of confirmed cases, and the latter relates to the death tolls. From these 2 perspectives, we can understand the transmission risks with more confidence, making them worthy of further investigation in the future.

We created an online Rasch modeling algorithm to display a visual representation of the geographic risks of the COVID-19 transmission. We are hopeful that the app will help us better understand travel risks and keep us updated on the situation of the current outbreak.

TWC developed the study concept and design. SC, JCJ, and YT analyzed and interpreted the data. SC monitored the process of this study and helped in responding to the reviewers' advice and comments. TH drafted the manuscript, and all authors provided critical revisions for important intellectual content. The study was supervised by WC. All authors read and approved the final manuscript.

The rate of underascertainment of novel coronavirus (2019-ncov) infection: estimation using Japanese passengers data on evacuation flights

Preliminary estimation of the basic reproduction number of novel coronavirus (2019-nCoV) in China, from 2019 to 2020: a data-driven analysis in the early phase of the outbreak

JHC. Coronavirus disease 2019 (COVID-19) outbreak. Available at

Coronavirus disease 2019 (COVID-19) outbreak

Risks to healthcare workers with emerging diseases: lessons from MERS-CoV, Ebola, SARS and Avian flu

Are coronavirus diseases equally deadly? Comparing the latest coronavirus to MERS and SARS

Estimation of MERS-coronavirus reproductive number and case fatality rate for the spring 2014 Saudi Arabia outbreak: insights from publicly available data

Approximate Bayesian algorithm to estimate the basic reproduction number in an influenza pandemic using arrival times of imported cases

Search COVID-19 risk assessment by country

Articles related to COVID-19

Wuhan coronavirus outbreak has now surpassed MERS (final toll of 858 deaths in

Novel coronavirus (ncov) data repository

Novel coronavirus (2019-nCoV) outbreak

CDC tests for 2019-nCoV

European Centre for Disease Prevention and Control (ECDC)

Novel coronavirus (2019-nCoV)

National Health Commission of the People's Republic of China (NHC)

Probabilistic Models for Some Intelligence and Attainment Tests (Reprint, with Foreword and Afterword by

The Rasch model

Rasch Measurement in Health Sciences

Generalizing the Rasch model for consumer rating scales

Solving measurement problems with the Rasch model

Applying the Rasch Model: Fundamental Measurement in the Human Sciences

Evaluation of mobile apps targeted to parents of infants in the neonatal intensive care unit: systematic app review

The comparative effectiveness of mobile phone interventions in improving health outcomes: meta-analytic review

Mobile phone apps for quality of life and well-being assessment in breast and prostate cancer patients: systematic review

Epidemiological characteristics and low case fatality rate of pandemic (H1N1) 2009 in Japan

2019-Novel coronavirus (2019-nCoV): estimating the case fatality rate - a word of caution

Novel coronavirus (2019-nCoV) fatality rate: WHO and media vs logic and mathematics

Novel coronavirus (2019-nCoV) fatality rate is 2%

The Stanford Encyclopedia of Philosophy (Spring

A rating formulation for ordered response categories

Student's performance shown on google maps using online Rasch analysis

Choropleth map legend design for visualizing the most influential areas in article citation disparities: a bibliometric study

Using Google Maps to display the pattern of coauthor collaborations on the topic of schizophrenia: a systematic review between

Dengue outbreaks and the geographic distribution of dengue vectors in Taiwan: a 20-year epidemiological analysis

Recognizing spatial and temporal clustering patterns of dengue outbreaks in Taiwan

Dot map cartograms for detection of infectious disease outbreaks: an application to Q fever, the Netherlands and pertussis

WHO. Global Health Observatory Map Gallery

Attractive quality and must-be quality

Using the Kano model to display the most cited authors and affiliated countries in schizophrenia research

Available at https://ncov2019.live/data

Repeated measure designs (time series) and Rasch

Rasch analysis of repeated measures

Rack and stack: time 1 vs time 2 or pre-test vs post-test

Value, cost, and marginal utility

Some notes on the term: "Wright map"

KIDMAP: Person-by-Item Interaction Mapping (Research Memorandum #29)

Medicine (2020) 99:24 www.md-journal

We thank AJE (American Journal Experts at https://www.aje.com/) for the English language review of this manuscript.