Introduction
With the COVID-19 pandemic raging across India, we have been under lockdown since March 25th, 2020. The lockdown has been widely welcomed by close to 1.3 billion people, even though it has brought their lives to a standstill. The 800-pound gorilla in the room, of course, is the question: “When should this lockdown be relaxed, and how do we know that we are making progress?”
In any epidemic, Rt is the measure known as the effective reproduction number. It is the average number of people who become infected by an infectious person at time t. The most well-known version of this number is the basic reproduction number: R0 when t = 0. However, R0 is a single measure that does not adapt with changes in behaviour and restrictions.
As a pandemic evolves, tightening or relaxing restrictions changes Rt. Knowing the current Rt is essential for policy-based decision making. When Rt > 1, each infected person infects more than one other person on average, so the number of cases keeps growing and the infection can spread through a large part of the population. When Rt < 1, the outbreak shrinks. The lower the Rt, the more manageable the situation.
The value of Rt helps us in:
- Understanding how effective the non-pharmaceutical interventions have been in controlling the outbreak.
- Giving vital information, regarding whether we should increase or reduce restrictions, based on our competing goals of economic prosperity and saving human lives.[1]
Somehow, this particular insight has been largely missed by the world. Except for Hong Kong[2], no one seems to be tracking this number, at least not in real time. The number is generally not that useful at the national level. The key is to understand it at the state or district level, where decisions about tightening or relaxing non-pharmaceutical interventions are implemented.
In this post, let’s discuss a framework for estimating Rt for the Indian states of Telangana (where I am based), Maharashtra, and Tamil Nadu, where the number of COVID-19 cases seems to be growing at the fastest rate in India.
As part of future work, we will be trying to do the same at the district / city level for a better understanding of Rt at the ground level.
This post borrows heavily from the work of Bettencourt and Ribeiro[3] and from Kevin’s GitHub repository[4].
Approach
We have an estimate of the number of new COVID-19 patients on a daily basis, and we can use it to estimate the current value of Rt. The value of Rt also depends on yesterday’s value, R(t-1), and on every earlier value R(t-n).
We can use Bayes Rule to update our belief about Rt, based on the new infection data that we are seeing each day.
$$P(R_t \mid k) = \frac{P(R_t) \cdot \mathrm{Likelihood}(R_t \mid k)}{P(k)}$$
The above equation can be read as follows: having seen k cases, the distribution of Rt is equal to:
- The prior belief about the value of Rt, P(Rt)
- Times the likelihood of Rt given that we have seen k cases
- Divided by the probability of seeing k cases under all hypotheses of Rt, P(k).
Importantly, since P(k) does not depend on Rt, it is a constant, and the posterior is proportional to the numerator. As all probabilities sum to 1.0, we can ignore P(k) and simply normalize the posterior so that it sums to 1.0:
$$P(R_t \mid k) \propto P(R_t) \cdot \mathrm{Likelihood}(R_t \mid k)$$
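As a minimal sketch of this update, assume Rt can only take a handful of candidate values. The function name `bayes_update`, the three candidate values, and the likelihood numbers are all made up for illustration; they are not taken from the analysis in this post.

```python
def bayes_update(prior, likelihood):
    """Return the normalized posterior over a discrete set of hypotheses."""
    # Numerator of Bayes' rule for every candidate value of Rt.
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    # The sum plays the role of P(k); dividing by it makes the posterior sum to 1.
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Three hypothetical candidate values of Rt with a uniform prior.
rt_values = [0.5, 1.0, 1.5]
prior = [1 / 3, 1 / 3, 1 / 3]
# Made-up likelihoods of today's case count under each candidate value.
likelihood = [0.02, 0.10, 0.08]

posterior = bayes_update(prior, likelihood)
for rt, p in zip(rt_values, posterior):
    print(f"Rt = {rt}: posterior = {p:.2f}")
```

Because the prior is uniform here, the posterior is simply the likelihood rescaled to sum to 1.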
Of course, this is for one day. Generalizing this across all the previous days we have measurements for, we can write the same as
$$P(R_t \mid k_t) \propto P(R_0) \cdot \mathrm{Likelihood}(R_t \mid k_t) \cdot \mathrm{Likelihood}(R_{t-1} \mid k_{t-1}) \cdots \mathrm{Likelihood}(R_1 \mid k_1)$$
With a uniform prior P(R0), this reduces to:
$$P(R_t \mid k_t) \propto \prod_{i=1}^{t} \mathrm{Likelihood}(R_i \mid k_i)$$
One potential issue with this Bayesian approach is that the posterior is influenced by events in the distant past just as much as by events in the recent past. In our case, this would mean that if Rt > 1 for a long period and has only recently come under control (Rt < 1), the posterior will stay stuck at values > 1 for a long time.
Of course, this would not work for us, because the entire purpose of this exercise is to see when Rt has dipped below 1.
One way to resolve this would be to just use the previous “m” days for calculating the likelihood function, rather than the entire history.
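The effect of restricting the history to the last m days can be sketched with a toy example. The function name `windowed_posterior`, the two candidate Rt values, and the per-day likelihoods below are hypothetical, chosen only to show how an old run of "Rt > 1" evidence can swamp recent "Rt < 1" evidence when the full history is used.

```python
import math


def windowed_posterior(daily_likelihoods, m):
    """Posterior over the candidate Rt values using only the last m days.

    daily_likelihoods[d][i] is the likelihood of day d's data under candidate i.
    A uniform prior is assumed and all likelihoods must be strictly positive.
    """
    window = daily_likelihoods[-m:]
    n_candidates = len(window[0])
    # Sum log-likelihoods instead of multiplying, to avoid numerical underflow.
    log_post = [sum(math.log(day[i]) for day in window) for i in range(n_candidates)]
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# Two hypothetical candidates: Rt = 0.8 (index 0) and Rt = 1.2 (index 1).
# Ten early days favour Rt = 1.2, the three most recent days favour Rt = 0.8.
history = [[0.1, 0.3]] * 10 + [[0.3, 0.1]] * 3

full = windowed_posterior(history, len(history))
recent = windowed_posterior(history, 3)
print(f"P(Rt = 0.8) using all days:    {full[0]:.4f}")
print(f"P(Rt = 0.8) using last 3 days: {recent[0]:.4f}")
```

With the full history the posterior still strongly favours Rt = 1.2, while the 3-day window picks up the recent drop. The choice of m is a trade-off: a short window reacts quickly but is noisier.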
LIKELIHOOD FUNCTION:
We will use the Poisson distribution as the likelihood function for this analysis, as it is the standard model for the “number of arrivals” in a given time period. Given an average arrival rate of λ new cases per day, the probability of seeing k new cases is given by the Poisson distribution:
$$P(k \mid \lambda) = \frac{\lambda^k e^{-\lambda}}{k!}$$
Figure 1: Poisson Distribution
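The Poisson probability can be computed directly from the formula. The sketch below uses only the Python standard library; the function name `poisson_pmf` and the example rate of 20 cases per day are assumptions made for illustration. It works in log space because λ^k and k! both overflow quickly for realistic case counts.

```python
import math


def poisson_pmf(k, lam):
    """Probability of seeing k arrivals when the average rate is lam (> 0)."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    # log(lam^k * e^-lam / k!) = k*log(lam) - lam - log(k!), with log(k!) = lgamma(k + 1)
    log_p = k * math.log(lam) - lam - math.lgamma(k + 1)
    return math.exp(log_p)


# Hypothetical example: an average of 20 new cases per day.
lam = 20
for k in (10, 20, 30):
    print(f"P(k = {k} | lambda = {lam}) = {poisson_pmf(k, lam):.4f}")
```

The probability is highest for counts near λ and falls off on both sides, which is the shape shown in Figure 1.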
DERIVING Rt FROM λ
The most important step in this work is connecting Rt to λ. The derivation itself is outside the scope of this blog post, but it can be found here.
$$\lambda = k_{t-1} e^{\gamma (R_t - 1)}$$
Here k_{t-1} is the number of new cases on the previous day, and γ is taken as the reciprocal of the serial interval (5 days for COVID-19), i.e. γ = 1/5 per day.
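To get a feel for this relationship, the sketch below evaluates the expected number of new cases for a few values of Rt. The function name `expected_cases` and the figure of 100 cases on the previous day are hypothetical; the 5-day serial interval is the value used in this post.

```python
import math

SERIAL_INTERVAL_DAYS = 5          # serial interval used in this post
GAMMA = 1 / SERIAL_INTERVAL_DAYS  # its reciprocal


def expected_cases(k_prev, rt, gamma=GAMMA):
    """Expected new cases today, given yesterday's new cases and a value of Rt."""
    return k_prev * math.exp(gamma * (rt - 1))


# Hypothetical example: 100 new cases yesterday.
for rt in (0.5, 1.0, 2.0):
    print(f"Rt = {rt}: expected new cases = {expected_cases(100, rt):.1f}")
```

At Rt = 1 the expected count equals yesterday's count; values above 1 give growth and values below 1 give decline.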
The problem can now be written as
$$\mathrm{Likelihood}(R_t \mid k) = \frac{\lambda^k e^{-\lambda}}{k!}$$
As the next step, we just have to perform the Bayesian update using the likelihood function, which in this case we have chosen to be Poisson.
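Putting the pieces together, a rough end-to-end sketch could look like the following. It is not the exact code behind this analysis: the grid of candidate Rt values (0 to 6 in steps of 0.01), the 7-day window, the function names, and the daily case counts are all assumptions made for illustration. It also makes the simplifying assumption that Rt is constant within the window, and it requires every previous-day count to be greater than zero.

```python
import math

GAMMA = 1 / 5                                # reciprocal of the 5-day serial interval
RT_GRID = [i / 100 for i in range(0, 601)]   # assumed candidate Rt values: 0.00 to 6.00


def log_poisson(k, lam):
    """Log of the Poisson probability of k arrivals at rate lam (> 0)."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def rt_posterior(new_cases, window):
    """Posterior over RT_GRID from the last `window` days, with a uniform prior."""
    start = max(1, len(new_cases) - window)
    log_post = [0.0] * len(RT_GRID)
    for t in range(start, len(new_cases)):
        k_prev, k = new_cases[t - 1], new_cases[t]
        for i, rt in enumerate(RT_GRID):
            # Expected new cases today if this candidate Rt were true.
            lam = k_prev * math.exp(GAMMA * (rt - 1))
            log_post[i] += log_poisson(k, lam)
    # Normalize in a numerically stable way so the posterior sums to 1.
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# Made-up daily new-case counts for illustration only (not real data).
cases = [20, 24, 27, 33, 38, 45, 52, 61]
posterior = rt_posterior(cases, window=7)
most_likely_rt = RT_GRID[posterior.index(max(posterior))]
print(f"Most likely Rt: {most_likely_rt:.2f}")
```

For this steadily growing made-up series the most likely Rt comes out above 1, as expected. Real daily counts are much noisier, so some smoothing of the input would likely be needed before applying this.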
Just to Summarize
Data for the Real World
We have used data from the COVID-19 India Tracker website (https://www.covid19india.org/). We have extracted the data for the states of Telangana, Maharashtra, and Tamil Nadu for the period 14th March 2020 to 14th April 2020.
We are in the process of collecting more data, but the present analysis is limited to the three states mentioned above.
Analysis
The analysis has been conducted for each of the three states of Telangana, Maharashtra, and Tamil Nadu.