
COVID-19 Coronavirus Disease Spread Analysis in German Regions and the World

In early 2020 the SARS-CoV-2 virus (colloquially known as the coronavirus) spread across the globe and caused a pandemic of COVID-19 disease. This page presents a collection of daily updated charts and analyses of the disease spread in German regions and the world. The focus lies on displaying past data and deriving trends for the near future. To make regions comparable, data is scaled by each region's population. Charts are updated daily in the morning (UTC). Raw data and plot scripts are available in my GitHub repository. This is a private, non-profit, spare-time project. Everything shown here depends on the quality of the data sources, so I cannot guarantee the correctness of the data.

Data sources

Feel free to drop me a message if you find a typo, bug or bigger issue. I also welcome suggestions for further analyses.

Stay safe, Torben

Table of Contents

Further good information sources

Germany

Counties

Map: animation of the spread in the counties

Click on the graphic to get to the animation.


Number of counties with new infections in the last week

The following graph shows the number of German rural and urban districts per incidence interval over time.

Analogously, the increase and decrease in incidence compared to the previous week:



County table with intensive care unit utilization

There is now also a configurable e-mail notification for district data.

A click on a line adds this district to the interactive district graphic.

May 6, 2020: In Germany, the limit value of 500 new infections per 1,000,000 inhabitants in seven days in a city / district was set as a guideline for tightening measures. Therefore, I have now sorted the county table according to this value in the "Infected per million inhabitants per week" column.

generated via Tabulator, raw data can be found here


Interactive district comparison

Select counties and cities in the table above. The data can be zoomed in with the button above the legend. The configuration of the graphic is stored in the URL (address line of the browser) and can thus be saved and shared.

generated via eCharts, raw data can be found here


Configurable county email notification

(Click on the screenshot to get to the registration)


Federal states

Interactive federal state comparison

generated via eCharts, raw data can be found in de-states


Infected, dead and intensive care unit utilization due to COVID-19 per federal state


Doubling time of new infections

August 22, 2020: Until today I had thought and hoped that the subject of "exponential growth in the number of cases" was off the table. Unfortunately, the latest numbers tell a different story. Therefore, today I created a new evaluation that quantifies the trend (increase and decrease): I calculated the doubling time or half-life of the new infections. The doubling time is the time in which a value doubles; analogously, the half-life is the time in which it halves.

Numerical example for clarification
Assuming there are 10 new infections today and the doubling time of 20 days remains constant, this results in
20 days → 20 new infections
40 days → 40 new infections
60 days → 80 new infections
80 days → 160 new infections
...
For this reason, a short doubling time is dangerous and a short half-life is good.
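
The numerical example can be reproduced with a few lines of Python. This is only an illustrative sketch; the function name projected_new_infections is hypothetical and not taken from the plot scripts.

```python
def projected_new_infections(n0, doubling_time_days, days_ahead):
    """New infections after days_ahead days, assuming a constant doubling time."""
    return n0 * 2 ** (days_ahead / doubling_time_days)


# 10 new infections today, doubling time of 20 days (values from the example above)
for days in (20, 40, 60, 80):
    value = projected_new_infections(10, 20, days)
    print(f"{days} days -> {value:.0f} new infections")
```

A negative doubling_time_days can be passed to model a half-life: the value then halves every |T| days.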

The following graphics show the doubling time (red) or half-life (green) based on the number of new infections per week. Each value was determined by a regression over the preceding 14 days, see below for details. The evaluation is inspired by Konstantin Tavan's presentation.

Result: In March the doubling time in Germany was mostly less than 5 days. The decline from the beginning of April, with a half-life of 10-15 days, was significantly slower than the preceding increase. Since the end of July we have been in the range of a doubling time of 20 days.

For the sake of completeness, now a brief explanation of the mathematical background

To determine the doubling time or half-life, I "fitted" the data of an interval with an exponential growth function (approximated using a regression analysis). The algorithm determines the 2 parameters of the growth function such that the function matches the data as closely as possible. Statements on growth can then be made from the values of the parameters determined in this way.

Exponential growth function:
f(x) = N0 exp(b x)
with N0: scaling factor / value at time x = 0 (today)
and b: growth parameter (b > 0 means increase, b < 0 means decrease).

To convert the parameter b into a more intuitive doubling time T, require that the value has doubled after the time T:
f(T) = 2 f(0)
→ N0 exp(b T) = 2 N0
→ b T = ln(2)
→ T = ln(2) / b
The same can be done for halving. Since ln(0.5) = -ln(2), the result has the same magnitude and only the sign differs: for decreasing data b is negative, so T is negative and its absolute value is the half-life.
Note: The doubling time / half-life T is independent of the scaling factor N0. It is therefore not necessary to scale the data to the population to determine the doubling time.

I repeated the above routine for each day on the x-axis in order to obtain the doubling time (or, for a decrease, the half-life) over time. For each day I use the y-values of that day and of the 13 previous days, with the x-values -13..0 (days). The scaling factor N0 is initialized with the y-value of the current day (but is also optimized by the algorithm).
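
The routine described above could be sketched as follows. Note the assumptions: instead of a nonlinear fit with a start value for N0, this sketch fits an ordinary least-squares line through the logarithm of the data. That is a simpler, swapped-in technique which weights small values differently and may give slightly different results. All function names are hypothetical, and windows containing zero or negative values are skipped.

```python
import math


def fit_exponential(xs, ys):
    """Least-squares line through (x, ln y); returns (n0, b) of f(x) = n0 * exp(b * x)."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_log) for x, l in zip(xs, logs))
    b = sxl / sxx
    n0 = math.exp(mean_log - b * mean_x)
    return n0, b


def doubling_time(b):
    """T = ln(2) / b; positive means doubling time, negative means half-life."""
    return math.inf if b == 0 else math.log(2) / b


def rolling_doubling_time(series, window=14):
    """Doubling time for each day that has window - 1 previous days of data."""
    xs = list(range(-(window - 1), 1))  # -13..0 for a 14-day window
    result = []
    for i in range(window - 1, len(series)):
        ys = series[i - window + 1 : i + 1]
        if min(ys) <= 0:
            result.append(None)  # logarithm undefined, skip this window
            continue
        _, b = fit_exponential(xs, ys)
        result.append(doubling_time(b))
    return result
```

Because the x-values run from -13 to 0, the fitted n0 is directly the modelled value for the current day.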


Federal states - infections

Newly infected people every 7 days

Charts of COVID-19 cases over time in the German federal states. I have scaled the numbers to the population of the federal states so that they can be compared with each other. Since the daily numbers fluctuate strongly and weekends have a clear effect on these fluctuations, I have used 7-day differences in the charts.


generated via Gnuplot plot-de-states-timeseries-joined, raw data can be found in de-state-XX.tsv

Total number of infected people

generated via Gnuplot plot-de-states-timeseries-joined, raw data can be found in de-state-XX.tsv

Ranking by infections

Ranking of the federal states by total number of infections (series "Cases per million population"), color-coded by incidence (series "Cases Last Week per 100,000 population").

Ranking of the federal states by incidence (series "Cases Last Week per 100,000 population"), color-coded by total number of infections (series "Cases per million population").


Federal states - victims

In the charts of the number of victims I have added reference values for other causes of death on the right-hand side, see the table below for sources. Note: the number of deaths lags about 3 weeks behind the number of infections. On average, patients die 14 days after the first symptoms, which in turn appear around 3-5 days after infection.

Newly deceased per 7 days

Total number of victims

generated via Gnuplot plot-de-states-timeseries-joined, raw data can be found in de-state-XX.tsv


Comparison of deaths 2016-2021

Based on the data from the German Federal Statistical Office, I compared the deaths in 2020 with those of previous years. All data series were smoothed with a 7-day moving average, and February 29 has been removed.
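
As a rough illustration of this preprocessing, here is a minimal Python sketch. It assumes a trailing 7-day window and removes February 29 before smoothing; whether the original evaluation uses a trailing or a centered window, and in which order the two steps are applied, is not stated here. The function names and the example values are made up.

```python
from datetime import date, timedelta


def drop_leap_day(rows):
    """Remove February 29 from a list of (date, value) pairs."""
    return [(d, v) for d, v in rows if (d.month, d.day) != (2, 29)]


def moving_average(values, window=7):
    """Trailing moving average; None until a full window is available."""
    smoothed = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            running_sum -= values[i - window]  # drop the value that left the window
        smoothed.append(running_sum / window if i >= window - 1 else None)
    return smoothed


# Made-up daily values around the leap day of 2020
start = date(2020, 2, 25)
rows = [(start + timedelta(days=i), 2500 + 10 * i) for i in range(10)]
rows = drop_leap_day(rows)
smoothed = moving_average([v for _, v in rows])
```

Removing the leap day gives every year 365 calendar days, so the years can be averaged day by day.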

For the comparison with the number of victims of COVID-19, I initially averaged the deaths per calendar day for 2016-2019 and plotted them together with the data from 2020. In the lower area you can see the difference between these two data series in addition to the COVID-19 victim numbers.


generated via Gnuplot plot-de-mortality.gp, raw data can be found in de-mortality.tsv

Here are some of our neighboring countries: Belgium, Great Britain, France, Switzerland, Spain.


Further evaluations

How high is the probability that an infected person is present when meeting people?

Here is a short digression into stochastics / probability theory with the aim of estimating the risk of infection when meeting people.
Assumption for the estimate: the number of new infections per week corresponds to the number of currently infectious people. It is important not to ignore unreported infections; a factor of 4 on the official case numbers seems realistic to me. So if 25 new infections per 100,000 inhabitants are reported in a region, I assume that 100 people per 100,000 inhabitants are currently contagious. The table below shows how likely it is that a contagious person is present for different group sizes. The Excel calculation for this, which also contains a "pocket calculator" for the free entry of values, is linked under the table.


generated via Excel probability-infected-meeting.xlsx
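
The same estimate can be sketched in Python. It assumes that each person in the group is infectious independently with the same probability, so that the probability of at least one infectious person is 1 - (1 - p)^n. The function name and the chosen group sizes are only illustrative and are not taken from the Excel file.

```python
def prob_infectious_present(reported_per_100k, group_size, unreported_factor=4.0):
    """Probability that at least one infectious person is in a group."""
    # Share of currently infectious people, including unreported cases
    p_single = reported_per_100k * unreported_factor / 100_000
    # Complement of "nobody in the group is infectious"
    return 1 - (1 - p_single) ** group_size


# 25 reported new infections per 100,000 inhabitants per week, factor 4
for size in (10, 50, 100, 500):
    p = prob_infectious_present(25, size)
    print(f"{size:>4} people: {p:.1%}")
```

The result is the probability that an infectious person is present, not the probability that an infection actually occurs.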

Event risk assessment

Based on the assumptions made in the previous section, here is a "calculator" for estimating the risk of events / crowds. The aim of this rough estimate is not to provide an exact number but the order of magnitude of the probability depending on the infection situation and the number of participants.

Comments on the input variables, assumptions and simplifications for this rough model:

  • The number of new infections per week is approximated by the number of currently infectious people. This is supported by the RKI's indication of an average duration of infection of 8-9 days.
  • Unreported cases (dark figure): the RKI estimates a factor in the range of 4.5–11.1.
  • Mandated quarantine is not taken into account. It could be accounted for by reducing the factor for unreported cases by up to 1 (only infectious people who are still moving around pose a threat to their surroundings). However, many people only learn their test result when they are already infectious.
  • Infectious persons are assumed to be evenly distributed across the entire population and not differentiated according to age groups etc.

Comment on the result:
The likelihood that an infection actually occurs depends on many other factors such as duration, environment and protective measures taken, and is very difficult to express as a probability in %. I am grateful for ideas and hints; you can contact me here.


How many days does the number of victims lag behind the number of infected people?

In the following figure I have plotted the newly infected and the newly deceased for Germany (each per 7 days and per 1 million inhabitants). I then shifted the curve of the newly deceased and scaled it to the spring peak of the new infections.

Result from October 28, 2020: In spring 2020 the delay between infection and death was about 14 days with a scaling of 4.3%. In the summer of 2020, the curve of the newly infected with this scaling is above the curve of the number of victims. Reasons for this could be found in the number of tests carried out and in the age profile of the infected.
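
One way to obtain such a delay and scaling programmatically is to match the peaks of the two curves. This is an assumption made for the sketch below; it is not necessarily how the values above were determined. The function names are hypothetical.

```python
def peak_index(series):
    """Index of the largest value in the series."""
    return max(range(len(series)), key=lambda i: series[i])


def lag_and_scale(new_cases, new_deaths):
    """Delay in days and deaths-to-cases ratio obtained by matching the two peaks."""
    i_cases = peak_index(new_cases)
    i_deaths = peak_index(new_deaths)
    lag = i_deaths - i_cases
    scale = new_deaths[i_deaths] / new_cases[i_cases]
    return lag, scale


def overlay_deaths(new_deaths, lag, scale):
    """Deaths curve moved lag days earlier and divided by scale (lag must be >= 0)."""
    return [d / scale for d in new_deaths[lag:]]
```

For real data with several waves, both series would first have to be restricted to the period around the spring peak, since the sketch simply uses the global maximum.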


How high is the unreported number of infected people compared to the official number of cases?

Here is an attempt to estimate the number of unreported cases (= difference between the total number of infected people and those who tested positive). I made the following very simplified rough assumptions (inspired by this article, section "Washington State"):

  • Anyone who dies today was infected 3 weeks ago
  • An infected person dies with a 1% chance
  • The doubling time of the number of victims is determined from the data of the last week and kept constant for the prognosis into the future

This allows the total number of people infected 3 weeks ago to be calculated backwards as 100 times (the reciprocal of 1%) the number of people who died today. This calculation can be carried out for each day X: the number of deaths on day X times 100 is assigned to day X-14. From this, the total number of people infected today can be extrapolated via a regression analysis / fit of the data (more on this below). Graphically, this looks as follows:


generated via Gnuplot plot-de-calc-deaths.gp, raw data can be found in de-state-DE-total.tsv

Comment on the assumptions: The mortality is probably lower than the assumed 1%. It usually takes 3-4 weeks from infection to death, not just 3 weeks as I calculated. Both effects increase the number of unreported cases.
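
The back-calculation can be sketched in Python as follows. The sketch makes several assumptions of its own: it takes a cumulative series of deaths with at least 8 positive daily values, it estimates the doubling time from just two points (today and 7 days ago) instead of a fit, and it leaves the delay between infection and death as a parameter lag_days, since the text above mentions both 3 weeks and an offset of 14 days. The function names are hypothetical.

```python
import math


def doubling_time_last_week(cumulative_deaths):
    """Doubling time in days from today's value and the value 7 days ago."""
    growth = cumulative_deaths[-1] / cumulative_deaths[-8]
    if growth <= 1:
        return math.inf  # no growth during the last week
    return 7 * math.log(2) / math.log(growth)


def estimate_infected_today(cumulative_deaths, lag_days, mortality=0.01):
    """Rough estimate of the total number of infected people today."""
    # Deaths today divided by the mortality = people infected lag_days ago
    infected_lag_days_ago = cumulative_deaths[-1] / mortality
    t_double = doubling_time_last_week(cumulative_deaths)
    # Extrapolate forward by lag_days with a constant doubling time
    return infected_lag_days_ago * 2 ** (lag_days / t_double)
```

Dividing the result by the number of officially reported cases gives the factor between the estimated and the reported number of infected people under these assumptions.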


Vaccination data

RKI evaluation of the vaccination data

The following graphic is from the RKI and is only embedded here.
Source with data in Excel format.


Vaccination Dashboard
Vaccination dashboard of the Federal Ministry of Health (BMG)

Side project: CO2 traffic light

Here is a pointer to my other coronavirus project: CO2 traffic lights for our school and day care center. Motivation: the CO2 concentration in the air is a good indicator of when it is time to ventilate. If you want to participate, please contact me. Current status:



Investigation of the exponential increase in infections in Germany

This chapter is no longer relevant because, fortunately, new infections are no longer increasing exponentially. So I archived it.

Countries Worldwide

Map: Country Casualties

Map of deaths per million population, borrowed from Wikipedia. Data source: the JSON file provided by github.com/pomber/covid19, which is based on data from the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE).


Country table

Click on a row to add it to the country comparison chart below.

"Cases Doubling Time" is derived from an exponential fit of the "Cases Last Week" of the last 14 days.

generated via tabulator, raw data can be found in countries-latest-all.tsv / countries-latest-all.json


Country comparison chart

Usage hints

  • Browse the country table above and click rows to select or deselect countries of interest
  • Download your chart using the button at the top right
  • Share a link to your chart by copying the updated URL of this page

Nomenclature

  • Cases means tested positive for COVID-19 and reported to JHU. Note: I do not trust the Cases data too much for international comparison, as it strongly depends on testing and reporting.
  • Deaths are much more reliable than Cases, I suppose. Deaths are furthermore more strongly related to the hospital workload.
  • Per million means scaled per million population. These series should be used when comparing countries.
  • New means change with respect to the previous day.
  • Last week means change with respect to 7 days in the past (= rolling week). It is therefore smoother than the New series.
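
To make the nomenclature concrete, the series could be derived from a list of daily cumulative counts roughly as in the following sketch. The function and key names are hypothetical and this is not the code from the repository.

```python
def derive_series(cumulative, population):
    """Derive the chart series from a list of daily cumulative counts."""
    n = len(cumulative)
    # New: change with respect to the previous day
    new = [None] * min(1, n) + [cumulative[i] - cumulative[i - 1] for i in range(1, n)]
    # Last week: change with respect to 7 days in the past (rolling week)
    last_week = [None] * min(7, n) + [
        cumulative[i] - cumulative[i - 7] for i in range(7, n)
    ]
    # Per million: scaled by the population
    per_million = [c * 1_000_000 / population for c in cumulative]
    last_week_per_million = [
        None if v is None else v * 1_000_000 / population for v in last_week
    ]
    return {
        "new": new,
        "last_week": last_week,
        "per_million": per_million,
        "last_week_per_million": last_week_per_million,
    }
```

Each rolling-week value is the sum of the seven most recent New values, which is why weekday effects cancel out.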

Chart

generated via eCharts, raw data can be found in country-XX.tsv / .json
Many thanks to Attila Andrási-Nagy for code review, cleanup and implementation of the first version of the select data logic!


Doubling Time of New Infections

In the following, the exponential growth of the new infections is analyzed. Where an increase is found, I fit the data with an exponential function to derive the doubling time of new cases. As data I used the "Cases Last Week" series of the last 14 days. The resulting doubling time is the number of days it takes for the new cases to double; shorter is worse, of course. The value of "Cases Last Week Per Million" is color-coded as additional information, red means high.


Event Risk Calculator

Assume an event / company / school with many people attending: what is the probability that a person infectious with COVID-19 is present?

I made the following simplifications for this rough model:

  • The number of new infections per week is approximated by the number of currently infectious people. This is supported by the German RKI, stating an average duration of infection of 8-9 days.
  • Unreported cases: For Germany the RKI expects a factor in the range of 4.5-11.1. The more tests are performed, the lower this factor should be.
  • Quarantine is not taken into account. This could be accounted for by reducing the factor of unreported cases by up to 1 (only infectious people who are still moving around are a threat to their surroundings). However, many people receive their test results only when they are already infectious and still around others.
  • The infectious people are assumed to be evenly distributed across the entire population and not differentiated according to age groups, etc.

Comment on the result: The likelihood that an infection actually occurs depends on many other factors such as duration, environment and protective measures taken, and is very difficult to express as a probability in %. I am grateful for ideas and hints; you can contact me here.


Country Rankings

Cases

Ranking countries by total cases (series "Cases per Million Population") and color-coding their current new cases (series "Cases Last Week per 100,000 Population").

Ranking countries by current new cases (series "Cases Last Week per 100,000 Population") and color-coding their total cases (series "Cases per Million Population").


Deaths

Ranking countries by total deaths (series "Deaths per Million Population") and color-coding their current new deaths (series "Deaths Last Week per Million Population").

Ranking countries by current new deaths (series "Deaths Last Week per Million Population") and color-coding their total deaths (series "Deaths per Million Population").


Comparison of selected countries

Current situation

Deaths, absolute values.


generated via Gnuplot plot-countries.gp, raw data can be found in countries-latest-selected.tsv

Deaths, scaled by population of the countries, to make them comparable.


generated via Gnuplot plot-countries.gp, raw data can be found in countries-latest-selected.tsv

See table below for reference data: deaths by other causes


Timeseries

First using absolute values.


generated via Gnuplot plot-countries-deaths.gp, raw data can be found in countries-timeseries-XX.tsv

Now again re-scaled to the population of the countries, to make them comparable.


Now weekly new deaths per million population. I decided to use this delta of the last 7 days (rolling week) instead of daily new deaths, since strong weekend and other reporting effects in the daily values can lead to wrong conclusions.


generated via Gnuplot plot-countries-deaths.gp, raw data can be found in countries-timeseries-XX.tsv

See reference table for the numbers used.


Reference data: deaths by other causes per year.


Time series, Doubling time calculation and forecast for selected countries

Since the exponential increase of the death toll has stopped in all countries, this chapter has been archived.

