Results
Can the increase in confirmed cases be explained by the number of
tests conducted?
The argument that the increase in registered SARS-CoV-2 infections is
mainly driven by a mere increase in the number of tests performed has
repeatedly been used against political countermeasures. There is a
continuing debate about the correct interpretation of rising and falling
case numbers in view of varying test activity, yet so far without any
substantial data on that aspect.
Unfortunately, neither the RKI nor any other institution in Germany is
in a position to report the exact number of tests conducted behind
the daily number of laboratory-confirmed COVID-19 cases. The RKI
publishes the number of tests and the rate of positive results reported
by laboratories participating in its surveillance system on a weekly
basis. However, the number of reporting labs was not constant over time,
since obligatory reporting was introduced with some delay. In addition,
not every test corresponds to a new patient tested, since repeated
tests, e.g. of persons under quarantine, are included. Still, the data
available can be used to estimate the overall testing activity in
Germany, assuming that the surveillance labs have been selected by the
RKI for representativity. The weekly test data were adjusted for the
number of labs reporting and divided by the number of days to obtain
test rates on a per-day-basis.
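The adjustment described above can be sketched as follows. This is a minimal illustration, not the code used for the analysis: the function name, the choice to scale each week to a fixed reference number of labs, and the example figures are all assumptions, since the exact adjustment is not spelled out here.

```python
def daily_test_rate(weekly_tests, labs_reporting, reference_labs, days_per_week=7):
    """Tests per day, scaled to a fixed number of reporting labs.

    Assumption: "adjusted for the number of labs" is read as scaling each
    week's total to a common reference lab count.
    """
    if labs_reporting <= 0 or days_per_week <= 0:
        raise ValueError("labs_reporting and days_per_week must be positive")
    tests_per_lab = weekly_tests / labs_reporting
    return tests_per_lab * reference_labs / days_per_week


# Made-up figures for illustration only (not RKI data).
weeks = [
    {"week": 11, "tests": 140_000, "labs": 100},
    {"week": 12, "tests": 350_000, "labs": 125},
]
rates = {
    w["week"]: daily_test_rate(w["tests"], w["labs"], reference_labs=100)
    for w in weeks
}
```

In this made-up example the raw weekly total rises by a factor of 2.5, but after correcting for the larger number of reporting labs the per-day rate only doubles.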
A linear regression model was built with test activity as the
explanatory variable and the number of confirmed cases as the dependent
variable. The model yielded an adjusted R squared of .07 (F = 7.70 at
89 DF, p = .007), i.e. a numerically weak but still statistically
significant effect.
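For readers who want to reproduce this kind of summary, the following is a minimal sketch of an ordinary least squares fit that reports R squared, adjusted R squared and the F statistic. The function names and the synthetic data are assumptions made for illustration; the analysis software actually used is not named here.

```python
import numpy as np


def ols_summary(X, y):
    """Fit y = b0 + X b by least squares; return R^2, adjusted R^2, F, residual DF."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])  # intercept plus predictors
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    df_resid = n - p - 1
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / df_resid
    f_stat = (r2 / p) / ((1.0 - r2) / df_resid)
    return {"r2": r2, "adj_r2": adj_r2, "f": f_stat, "df_resid": df_resid}


def adj_r2_from_f(f_stat, p, df_resid):
    """Recover adjusted R^2 from an F statistic with p predictors."""
    r2 = p * f_stat / (p * f_stat + df_resid)
    n = df_resid + p + 1
    return 1.0 - (1.0 - r2) * (n - 1) / df_resid


# Synthetic data for illustration only (91 days, one predictor).
rng = np.random.default_rng(0)
tests_per_day = rng.normal(40_000, 8_000, size=91)
cases_per_day = 0.02 * tests_per_day + rng.normal(0, 900, size=91)
summary = ols_summary(tests_per_day, cases_per_day)
```

As a plausibility check, `adj_r2_from_f(7.70, 1, 89)` evaluates to roughly 0.07, which matches the adjusted R squared reported for the single-predictor model.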
Can changes in the number of confirmed cases be explained by changes
in public mobility?
Various parties have noted that infection rates in Germany had already
started to decline before lockdown measures were implemented. This is
typically explained by the assumption that many people were concerned
and voluntarily followed the public appeals to avoid unnecessary
movements in public long before the official lockdown was announced. In
this context, the “COVID 19 mobility project” was initiated to measure
public mobility based on mobile phone data, which should serve as an
indicator of compliance with the social distancing policy. The project
initially met with some resistance and concerns regarding data
protection, but the RKI considered it important.
Data on public mobility, as transcribed from the website, were included
in the linear model described above as a second predictor of the number
of confirmed cases. The effect was impressive: the combined model
yielded an adjusted R squared of .25 (F = 15.04 at 88 DF, p = .000002),
with a highly significant effect for mobility, while the “number of
tests” effect was no longer visible. However, what at first sight seems
to prove that mobility reduction prevented infections is in fact quite
surprising. The outcome implies a strong correlation of the number of
confirmed cases, aggregated by date of first symptoms, with mobility on
the same(!) day. This is not to be expected given an incubation period
of about 5 to 6 days. To investigate whether mobility also has
predictive value for the number of confirmed cases at a later point in
time, correlation coefficients were calculated with a time lag
increasing from 0 to 8 days. As can be seen in Figure 1, the
correlation coefficients show a linear decrease with increasing time
lag.
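The lagged correlation analysis can be sketched as below. This is a minimal illustration under the assumption that both series are daily and aligned by date; the function name and the synthetic series are hypothetical and only demonstrate the mechanics.

```python
import numpy as np


def lagged_correlations(mobility, cases, max_lag=8):
    """Pearson r between mobility on day t and cases on day t + lag."""
    mobility = np.asarray(mobility, dtype=float)
    cases = np.asarray(cases, dtype=float)
    if mobility.shape != cases.shape:
        raise ValueError("series must have equal length")
    result = {}
    for lag in range(max_lag + 1):
        x = mobility[: len(mobility) - lag]  # mobility on day t
        y = cases[lag:]                      # cases on day t + lag
        result[lag] = float(np.corrcoef(x, y)[0, 1])
    return result


# Synthetic check: cases mirror mobility with a 3-day delay.
rng = np.random.default_rng(1)
mobility_demo = rng.normal(size=100)
cases_demo = -np.roll(mobility_demo, 3)
r_by_lag = lagged_correlations(mobility_demo, cases_demo)
```

In the synthetic series the strongest negative correlation appears at the built-in delay of 3 days. If mobility drove infections, the real data would be expected to show a comparable extremum near the incubation time rather than at lag 0.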
Thus, the data do not support the idea that changes in public
mobility affect the number of confirmed COVID-19 cases. The correlations
rather suggest the opposite: it might be the daily communication of
rising or falling infection rates that had an impact on people’s
compliance with the “stay at home” requests. If that were the
case, the mobility index might be expected to correlate more strongly
with the number of cases based on date of report than with the number
based on date of first symptoms. And indeed, the correlation was found
to be -.51 for the former and -.68 for the latter. The correlation
between mobility and the number of confirmed cases five days later, the
estimated mean incubation time, was just -.26.
Do changes in the number of newly confirmed COVID 19 cases per day
occur in a plausible relationship to public health measures taken by
the German government?
Residuals derived from the linear regression model calculated in section
2 were used for further analyses, thus eliminating the influence of
varying test activity. The residuals were submitted to a time series
decomposition on a calendar-week basis, using a standard procedure of the
analysis software package. Figure 2 shows the results for the trend (i.e.
the 7-day moving average), the seasonal component (i.e. the observed
regular fluctuations per weekday) and the remaining error or random
component. The random component indicates somewhat more dynamic
developments between calendar weeks 12 and 16 and, at the lower end,
around calendar weeks 21/22. The seasonal curve shows a pronounced
periodicity, which is probably due to less testing activity on weekends
and delayed reporting, even though the effect is reduced by using the
date of disease onset instead of the date of reporting. The 7-day moving
average clearly accounts for most of the variation.
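A decomposition of this kind can be sketched as a classical additive decomposition with a period of 7 days. This is an assumption about what the "standard procedure" does, not a reproduction of it; the function name and the synthetic input are illustrative.

```python
import numpy as np


def decompose_weekly(values, period=7):
    """Additive decomposition into trend, weekday pattern, and remainder.

    Assumes an odd period so that a centred moving average is symmetric.
    """
    y = np.asarray(values, dtype=float)
    half = period // 2
    trend = np.full(len(y), np.nan)
    # Centred moving average; the first and last `half` days stay undefined.
    trend[half : len(y) - half] = np.convolve(y, np.ones(period) / period, mode="valid")
    detrended = y - trend
    # Mean detrended value per position in the week, centred on zero.
    pattern = np.array([np.nanmean(detrended[k::period]) for k in range(period)])
    pattern -= pattern.mean()
    seasonal = pattern[np.arange(len(y)) % period]
    remainder = y - trend - seasonal
    return trend, seasonal, remainder


# Synthetic residual series: linear drift plus a fixed weekday pattern.
weekday_effect = np.array([3.0, 2.0, 1.0, 0.0, -1.0, -2.0, -3.0])
t = np.arange(56)
series = 0.5 * t + weekday_effect[t % 7]
trend, seasonal, remainder = decompose_weekly(series)
```

For this noise-free input the trend recovers the drift, the seasonal component recovers the weekday pattern, and the remainder is zero wherever the moving average is defined.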
For a more detailed inspection, the trend curve was isolated in Figure 3,
with line markers added for landmark decisions taken by the German
government. The time points for major interventions in March (“a” to
“c”) were chosen in alignment with the modeling studies by the MPI
mentioned earlier; time points “d” (first opening steps on April 20)
and “e” (law on extended testing for asymptomatic patients became
effective, May 22) were added on top. The time points “a” to “d” were
marked with a 5-day lag from the actual date of intervention to account
for the estimated incubation time, as these were assumed to have a direct
impact on infections. Time point “e” was marked with the date the law
took effect, since testing by itself should not influence infection risk
but might affect the chance of detecting cases.
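The placement of the line markers amounts to simple date arithmetic, sketched below. The intervention dates are those given in the text; the year 2020 and the variable names are assumptions added for the example.

```python
from datetime import date, timedelta

INCUBATION_LAG = timedelta(days=5)

# (label, intervention date, whether the 5-day lag applies); year 2020 is assumed.
interventions = [
    ("a", date(2020, 3, 9), True),    # cancellation of large public events
    ("b", date(2020, 3, 16), True),   # closure of schools and nurseries
    ("c", date(2020, 3, 23), True),   # full lockdown measures
    ("d", date(2020, 4, 20), True),   # first opening steps
    ("e", date(2020, 5, 22), False),  # extended testing law, no lag applied
]

markers = {
    label: (day + INCUBATION_LAG if lagged else day)
    for label, day, lagged in interventions
}
# ISO calendar week of each marker, for comparison with the weekly axis.
weeks = {label: day.isocalendar()[1] for label, day in markers.items()}
```

Under these assumptions marker “a” falls on March 14 and marker “c” on March 28, while marker “e” stays on May 22.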
The curve shows a steep and remarkably smooth increase, which starts to
bend slightly before line marker “a”, the day when the first measure
(the cancellation of large public events on March 09) would be expected
to take effect. After a peak at the beginning of calendar week 12 it
starts to decline, without any visible effect at line marker “b”, the
closure of schools and nurseries (March 16). Between calendar weeks 14
and 16, i.e. for approximately two weeks after line marker “c”, the full
implementation of lockdown measures in Germany on March 23, the curve
shows a slightly accelerated decline and returns to the previous trend
thereafter. There is no discernible change at line marker “d”, the day
when Germany started to ease some of the lockdown measures, while
the curve shows a small peak that corresponds to the date when the
updated testing policy took effect, allowing for more tests of
asymptomatic patients under certain conditions.
The only political measure implemented in a plausible temporal
relationship to a turning point of the curve was the decision to cancel
large public events on March 09. None of the subsequent decisions on
tightening or loosening restrictions shows a significant impact, even
though there are two weeks when the numbers went down a bit faster. This
probably reflects the true effect of the shutdown measures, an effect
that would not be sustainable and that bears no reasonable relationship
to the measures’ dramatic economic and social impact. Interestingly, what
seems to have caused an immediate reaction was a slight modification of
the national COVID-19 test strategy put into effect on May 22. The
potential bias introduced by the test strategy is the topic of the next
section.
What further sources of bias impede valid conclusions from the number
of confirmed cases per day?
Figure 3 still shows an exponential increase in confirmed cases during
the first weeks, even after the effect of varying test activity has been
eliminated. However, it remains questionable whether this observed
increase truly reflects the spread of infection in the German population.
Typical model simulation studies that try to estimate the effect of the
measures on the number of infections and the number of deaths rest on two
implicit assumptions:
(1) “Patient zero” was correctly identified, i.e. it is known when the
virus hit the country.
(2) The number of confirmed cases is always linearly related to the total
number of infected patients.
Both assumptions may need to be challenged in the light of today’s
knowledge. While the first cases in Europe were officially
confirmed in January this year, genetic analyses, sewage water assays
and retrospective analyses of frozen blood samples provide more and more
evidence that the virus might have circulated, at least in France and
Italy, much earlier: since at least mid-December, maybe November, or
even earlier. The authors of the respective papers already indicated that
this could imply some bias in the assumptions regarding disease
progression, but they did not expand on that and, as far as I know,
nobody has picked the topic up.
In addition, there is increasing and consistent evidence from Germany
and all over the world that the number of unreported cases is at least
ten times as high as the number of laboratory-confirmed cases. We have
learned that a much larger proportion of cases than expected
(estimates vary between 40% and 85%) remain completely asymptomatic or
at most show signs of a common cold, while the RKI, at least in the
early days, had expected almost every infected person to become
symptomatic sooner or later.
For most of the time covered in this paper the official recommendations
on PCR testing remained unchanged. Only patients with clinical symptoms
such as cough or fever were to be tested, and only if they had previously
been in contact with a confirmed case. Until they showed symptoms,
contact persons identified as suspects by the local health authorities
were kept under closer surveillance.
If this test strategy is implemented at the beginning of an outbreak, the
number of cases identified will indeed give an idea of what happens in
the population, even if some asymptomatic cases are missed. The
number of unreported cases will be in a linear relationship to the number
of confirmed cases. However, what happens if testing starts while there
already is a small but substantial share of infected patients in the
region where the first case is detected, e.g. some 2 to 3 percent? This
is a realistic scenario; estimates from one of the first German hotspots
have even been in the region of 15%.
The trigger case could be someone ending up in the ICU with severe
pneumonia. Following up on contacts will yield an average of 30 to 40
persons under closer surveillance. This already gives a reasonable
chance of finding the next infected person, who might have just a slight
cold and normally would not have been tested. Another 40 contacts under
surveillance, another case found, an ever-growing group of suspects, and
2 to 3 cases detected per hundred, since this is the prevalence rate
assumed. This pattern of confirmed cases begetting confirmed cases
constitutes a classic exponential increase, but it does not involve
mutual infection. It rather represents a sort of calibration curve for
the test strategy.
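The tracing mechanism described above can be made concrete with a small expected-value sketch. It is a deliberately simplified illustration, assuming a constant prevalence, a fixed number of contacts per detected case, and detection of every infected contact; the function and parameter names are hypothetical.

```python
def expected_detections(prevalence, contacts_per_case, generations, seed_cases=1.0):
    """Expected newly detected cases per tracing round at constant prevalence.

    Assumptions (illustrative): every detected case puts `contacts_per_case`
    people under surveillance, each of whom is already infected with
    probability `prevalence` and is detected if infected. No transmission
    between the traced persons is modelled.
    """
    per_case = prevalence * contacts_per_case  # expected detections per traced case
    counts = [seed_cases]
    for _ in range(generations):
        counts.append(counts[-1] * per_case)
    return counts


# Upper end of the scenario in the text: 40 contacts, 3% prevalence.
scenario = expected_detections(prevalence=0.03, contacts_per_case=40, generations=4)
# Hotspot estimate from the text: 15% prevalence, same number of contacts.
hotspot = expected_detections(prevalence=0.15, contacts_per_case=40, generations=4)
```

In this simplified picture the detected count grows geometrically only when the product of prevalence and contacts per case exceeds one: it is 1.2 for 40 contacts at 3% and 6 at the 15% hotspot estimate, whereas 30 contacts at 2% give 0.6 and the chain of detections would fade out.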
After a very short time, some two or three weeks, the point would be
reached when the detectable share of patients (those with symptoms at a
certain point in time) has been identified, and only from this
time on do the observations truly reflect the increase or decrease of
infection rates in the wider population. Even then, this does not mean
that new confirmed cases were infected by previously confirmed cases;
they all might have caught the virus from various asymptomatic
spreaders who never showed up in the statistics.
This is not the only potential bias in public surveillance data. The
other one is a negative sampling bias affecting the perceived risk of a
severe course of disease and death. Basically, there are two ways to be
confirmed as a COVID-19 patient: either as a contact of a confirmed case,
as described above, or as a patient hospitalized for severe respiratory
syndrome. Thus the group of confirmed cases represents a mix of a
negative selection (patients with symptoms) and a highly negative
selection (patients requiring hospitalization). Indeed, this can also be
seen directly from the data. One of the graphs provided in the daily
situation update published by the RKI shows how the age groups are
represented among the confirmed cases compared to the age
distribution in the total population. What can be seen is a more or less
even distribution between 20 and 59 years, the working population, and
a tendency towards lower numbers for the younger and older age groups,
which fits the results showing children to be more resilient, while older
people often have reduced social contacts and might thus be less exposed
to potential infections. But then patients older than 70 years are
clearly over-represented compared to all other age groups. About 17% of
all confirmed COVID-19 cases belong to this group, and they account for
85% of all deaths reported. Interestingly, the percentage of hospitalized
patients has likewise been in the region of 17% for most of the time.
This clearly looks like a bimodal distribution, and it could be
interesting to analyze how these 17% hospitalized patients were
identified. My prediction would be that a large share of those
patients were not detected by contact tracing but were admitted
directly to hospital with severe pneumonia and then tested. The
high-risk group should not just be viewed as a share of all COVID-19
patients but as a distinct sub-population with specific features (one
of them obviously advanced age) that make them vulnerable to a severe
course of disease. Any backward inference from the currently registered
COVID-19 patients to the general risk in an elderly population is
meaningless, since it is unknown how many of them, despite advanced age,
do not suffer any severe consequences. We observe a number x of
patients who require hospitalization and a number y of patients who do
not, but there does not seem to be a real connection between those
groups. With the current massive expansion of test activities, we
observe a continuously decreasing hospitalization and death rate, which
is in line with this assumption.