
Saturday, March 21, 2020

SIR, SIS and COVID-19 in Pakistan

Much work has been done on social networks and interactions, and how they relate to diffusion in a society. This diffusion can be of ideas, technology, disease or anything else that can be transmitted from one person to another. Given the recent Covid-19 outbreak, it might be useful to look at some basics to see if they can provide some insight into the dynamics of contagion and possible mitigation steps.
For the purpose of this post I’ll be using the example of a viral infection.

Diffusion and Contagion

The most fundamental model of contagion in a society is the SIR model. The acronym stands for the three states a person in a population can be in with regard to a contagion:
  1. Susceptible – hasn’t gotten the virus yet, but could get it from another person
  2. Infected – has the virus and can potentially infect someone else
  3. Removed/Recovered – had the virus, but is no longer a threat, either through recovering and developing an immunity, or through dying.
Another version of this model is the SIS model, in which a person returns to being susceptible after having been infected.
The basic idea behind this model is that people randomly come into contact with each other and there is a possibility of contagion if an infected person meets a susceptible person. There is no transmission if people coming into contact with each other are either both susceptible or both infected (or both removed).
Given this setup, the graph of the total number of people infected over time follows an S-shaped curve. This is a cumulative figure which includes everyone ever infected. The underlying graph of new cases per period rises to a peak and then falls, and serves as a rough indicator of how many people are infected at a particular point in time.

 

We see that in the first half, the number of cases starts slowly but grows at an increasing rate. This is because, initially, there are very few people who can infect others. Therefore, random meetings do not have a high probability of causing the transmission of the virus, and the number of new cases doesn’t grow very rapidly. However, any interaction by an infected person is likely to lead to a new case, because infected people are mostly coming into contact with susceptible people. Over time, the number of infected people grows and the rate of new cases also increases. At the halfway mark, the rate of new cases is at its peak: half the population is infected, and there is a fifty percent chance that any interaction an individual has is with an infected person.
After reaching the halfway mark, the infection slows down. This means that the number of new cases starts to shrink. This is simply because most of the population is now already infected. Thus, random interactions are more likely to be between people who are already infected, and so do not lead to new cases. Simply put, there are fewer susceptible people around to infect, leading to fewer new cases of infection.
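The two paragraphs above can be checked with a few lines of Python. This is only a rough sketch of the random-mixing idea with nobody recovering; the population size, the starting number of infected people and the `contact_rate` are numbers I made up for illustration, not estimates for Covid-19.

```python
def simulate_si(population=10_000, initial_infected=10, contact_rate=0.5, steps=60):
    """Random mixing with no recovery. All parameter values are hypothetical."""
    infected = float(initial_infected)
    cumulative = [infected]   # everyone ever infected, at the start of each step
    new_cases = []            # new infections during each step
    for _ in range(steps):
        susceptible = population - infected
        # A contact made by an infected person only causes a new case if it
        # lands on a susceptible person, which happens with probability
        # susceptible / population under random mixing.
        new = contact_rate * infected * susceptible / population
        infected += new
        new_cases.append(new)
        cumulative.append(infected)
    return cumulative, new_cases


cumulative, new_cases = simulate_si()
peak_step = max(range(len(new_cases)), key=new_cases.__getitem__)
print(f"new cases peak at step {peak_step}, "
      f"when {cumulative[peak_step] / 10_000:.0%} of the population is infected")
print(f"infected at the end: {cumulative[-1]:,.0f}")
```

With these made-up numbers the peak in new cases should land close to the point where half the population has been infected, which is the halfway mark described above.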
This S-curve looks somewhat like what we’ve seen from China (though of course there are differences in the mechanism as they quarantined effectively). Most other countries are at different stages in the first half of the curve. Obviously, the total number of people infected in the end can be reduced if we limit the number of people coming into contact with each other. In those cases, the infection will start to slow down earlier as, again, there are fewer people left to infect.
By practising distancing, countries are trying to reduce the number of interactions between people, and consequently slow the rate of new infections. The intention of ‘flattening the curve’ does not seem to be to prevent everyone from getting the virus altogether. Certainly, for vulnerable groups, such as the aged and those with compromised immune systems, the hope is that they will not catch the virus at all.
For everyone else, however, the intention is more to prevent hospitals from being flooded by cases all at once. Thus, the hope is that anyone infected gets the best possible care and improves their chances of recovery. I think it is implicitly understood that most people will probably get infected and the hope is that they will develop an immunity to it. The UK took a novel (and extreme) position on this called ‘herd immunity’ where they decided to let a large number of people get infected, so that they develop an immunity and are no longer contributing to the spread of infection. In terms of the SIR model, this means that people move to the Recovered/Removed category. They have now abandoned this idea as being too risky (among other things, it was pointed out that we don’t know yet whether Covid-19 follows an SIR type model or an SIS one).
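To make ‘flattening the curve’ and the SIR versus SIS question a little more concrete, here is a minimal sketch of both models with a daily time step. The contact rates, the recovery rate and the population size are assumptions I picked for illustration only; halving the contact rate stands in for distancing.

```python
def simulate(contact_rate, recovery_rate=0.1, population=1_000_000,
             initial_infected=100, days=730, lasting_immunity=True):
    """Daily-step SIR (lasting_immunity=True) or SIS (False). Hypothetical rates."""
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    active = []  # number of people infected on each day
    for _ in range(days):
        new_infections = contact_rate * s * i / population
        recoveries = recovery_rate * i
        s -= new_infections
        i += new_infections - recoveries
        if lasting_immunity:
            r += recoveries   # SIR: recovered people stay removed
        else:
            s += recoveries   # SIS: recovered people become susceptible again
        active.append(i)
    return active, r


for label, rate in [("no distancing", 0.4), ("distancing", 0.2)]:
    active, removed = simulate(rate)
    peak = max(active)
    print(f"{label}: peak of {peak:,.0f} infected at once on day "
          f"{active.index(peak) + 1}, {removed:,.0f} ever removed")

sis_active, _ = simulate(0.2, lasting_immunity=False)
print(f"SIS with distancing: {sis_active[-1]:,.0f} still infected after two years")
```

In this sketch the lower contact rate gives a later and lower peak, which is the load on hospitals that distancing is trying to reduce. In the SIS variant the infection never burns out, which is one way to see why the SIR or SIS question matters so much for a herd immunity strategy.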



Detection

Now that we have some basic understanding of the dynamics of contagion, we can move towards questions of detection, which is in itself an interesting problem.
Another way to think about contagion is through the spread of a virus along a network. People are connected to each other through real-life social networks. They are likely to interact with people they have some relationship with, and thus the system of infection operates within these networks. The size of each component in a network could give some idea as to how much the virus will spread. So, an individual can get infected through contact with someone in their family and then pass it along to people at work, who then pass it along to their friends and family and so on. Everyone in this chain of events is in the same component of a network. Thus, the chances of spread are high when people are highly interconnected. Here again, we can see that the concept of distancing is an attempt to sever the links and break the network into smaller components to block the paths of contagion.
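The idea of components can be sketched in code. The people and links below are invented for the example; the function simply counts how many people can be reached from each other through a chain of contacts.

```python
from collections import defaultdict, deque


def component_sizes(people, links):
    """Sizes of the connected components of a contact network, largest first."""
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, sizes = set(), []
    for person in people:
        if person in seen:
            continue
        # Breadth-first search over everyone reachable from this person.
        seen.add(person)
        queue, size = deque([person]), 0
        while queue:
            current = queue.popleft()
            size += 1
            for other in neighbours[current]:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
        sizes.append(size)
    return sorted(sizes, reverse=True)


# Hypothetical names and contacts, for illustration only.
people = ["Ali", "Sara", "Omar", "Hina", "Bilal", "Zara"]
links = [("Ali", "Sara"), ("Sara", "Omar"), ("Omar", "Hina"),
         ("Hina", "Bilal"), ("Bilal", "Zara")]
print(component_sizes(people, links))      # [6]

# Distancing: two of the links are cut.
cut = {("Sara", "Omar"), ("Hina", "Bilal")}
distanced = [link for link in links if link not in cut]
print(component_sizes(people, distanced))  # [2, 2, 2]
```

Before the links are cut, one infected person could in principle reach all six people. Afterwards, an infection can reach at most two.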
By thinking in terms of networks, we also come to another realisation. People who are very well connected, or at the heart of a network, are much more susceptible to infection than people on the peripheries. If I have only one social connection, and that connection is in turn only connected to me, there is practically zero chance of either of us catching a viral infection. On the other hand, if I have a lot of connections, or am connected to people who have a large number of connections, then I am much more susceptible to infection because it would just take one of those many people to get infected for me to be exposed to it.
Christakis and Fowler decided to use this principle to develop an early detection system for outbreaks. Their idea was that people who are more central in a network (i.e. they are well connected, directly or indirectly) are more likely to catch a contagious disease earlier than people selected at random. They tested this theory in a study and found that it did indeed hold. Students who had more friends or were more central to a network went through the flu cycle several days before a randomly selected group of students did. Thus, their idea was that one could monitor people who were more central in a network and use them as markers for an early warning system for the general population.
Perhaps governments such as Pakistan’s, which have limited capacity to carry out large-scale testing, should be looking towards targeting their testing. It seems they already are targeting, in that they are checking travellers and people with symptoms. But maybe they should be looking at other factors as well, such as the centrality of individuals. How this centrality is established is a tricky question that will have to be figured out.
The approach taken by Christakis and Fowler was very interesting in that it used the concept of the Friendship Paradox. The idea was basically that if you choose people at random and ask them to list their friends, the names of people who are more connected are more likely to turn up. Suppose person A has 1 friend and person B has 99 friends. When you choose an individual at random, you are more likely to pick someone who is friends with person B than with person A. So this is exactly what they did: they tracked people chosen at random, and compared the progress of their flu against that of people who had been named as friends.
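The paradox is easy to see in a toy network. The sketch below is my own illustration, not the procedure from the study: person B has 99 friends, and each of those friends knows only B. It compares the average number of friends of a randomly chosen person with the average number of friends of the person they would name.

```python
from statistics import mean


def friendship_paradox(friends):
    """Return (mean friends of a random person, mean friends of a named friend).

    `friends` maps each person to the set of people they list as friends.
    People who list nobody are left out of both averages.
    """
    degree = {person: len(listed) for person, listed in friends.items()}
    named = {person: listed for person, listed in friends.items() if listed}
    mean_random = mean(degree[person] for person in named)
    # For each person, average over the friends they could name,
    # then average that over all people.
    mean_named = mean(mean(degree[f] for f in listed) for listed in named.values())
    return mean_random, mean_named


# Hypothetical network: B is the hub, P0..P98 each know only B.
friends = {"B": {f"P{k}" for k in range(99)}}
for k in range(99):
    friends[f"P{k}"] = {"B"}

mean_random, mean_named = friendship_paradox(friends)
print(f"random person: {mean_random:.2f} friends on average")
print(f"named friend:  {mean_named:.2f} friends on average")
```

In this extreme case almost everyone names B, so the named group is far better connected than the random group. Real networks are less lopsided, but the direction of the effect is the same.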
This might not be a practical approach for a government to take (or maybe it is – ask people in a community to nominate a person to be tested?), but it might be useful to think about how they could achieve the same goal. So, one might imagine that people in professions that bring them into contact with a large and diverse group of people would be prioritised.
I am assuming that these are the kinds of questions experts in the government are tinkering with now.

 

Interpreting Detection


Finally, I move to the question that I feel is critical and needs to be investigated in the current situation. Given that we have very few tests compared to the size of the population, how do we interpret them in terms of establishing:
  1. The current penetration of the infection in the population
  2. The direction/stage that the infection is currently in.
Standard practice for estimation in static problems is to draw a random sample of a population and do calculations on the sample. For example, if one wanted to find out how many people in a community possess a mobile phone, one would randomly select a sample of individuals and calculate the proportion with a mobile phone. Given that these individuals are selected at random, there’s no reason for me to be systematically over- or under-estimating the average of the community. The larger the sample, the more confident one would be about the estimate being an accurate representation of the entire community as it reduces the chances that the sample I drew was freakishly different from the community as a whole.
That estimate gives me a snapshot of mobile phone ownership in the community at that point in time. Over time, more people may buy mobile phones, and the validity of my result will depend on how rapidly that happens.
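The effect of sample size can be sketched with the usual normal approximation for a proportion from a simple random sample. The 60% ownership figure is a number I invented for the example, and the approximation itself is an assumption that only holds for reasonably large random samples.

```python
import math


def margin_of_error(proportion, sample_size, z=1.96):
    """Approximate 95% margin of error for a proportion from a random sample."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# Hypothetical: 60% of the sample owns a mobile phone.
for n in (100, 1_000, 10_000):
    print(f"sample of {n:>6,}: 60% plus or minus "
          f"{margin_of_error(0.6, n):.1%}")
```

The margin shrinks as the sample grows, but only with the square root of the sample size, so a hundred times more respondents buys ten times more precision.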
In the case of testing for a viral infection, however, we are faced with multiple deviations from this simple setup. Firstly, the testing is not done at random. Generally, we seem to be testing people who are assumed to be more likely to be infected (or people who are self-selecting). If the assumption is true, then the number of positive cases as a percentage of people tested would be much larger than the proportion of cases in the general population. If the assumption is false, we may end up thinking the estimates are over-stated when they are not!
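A little arithmetic shows how large the gap can be. In this sketch only people with symptoms are tested, and every number is hypothetical: 1% of the population is infected, 60% of infected people show symptoms, and 5% of uninfected people show similar symptoms for other reasons. The test itself is assumed to be perfectly accurate.

```python
def positive_share_among_tested(prevalence, p_symptoms_if_infected,
                                p_symptoms_if_healthy):
    """Share of positive tests when only people with symptoms are tested."""
    infected_tested = prevalence * p_symptoms_if_infected
    healthy_tested = (1 - prevalence) * p_symptoms_if_healthy
    return infected_tested / (infected_tested + healthy_tested)


true_prevalence = 0.01  # hypothetical
share = positive_share_among_tested(true_prevalence, 0.60, 0.05)
print(f"true prevalence in the population: {true_prevalence:.1%}")
print(f"positive share among those tested: {share:.1%}")
```

Under these assumptions the positive share among the tested is roughly ten times the true prevalence, so the percentage of positive tests says very little about the population unless we know how people were selected for testing.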
However, reporting of cases is done in terms of total numbers rather than as percentages. Due to asymptomatic cases, this can never be an overestimate and is most likely an underestimate. How does one establish confidence levels in this setup?
A huge systemic issue arises from the complex dynamics of such a problem. It seems to me that the fact that a person with Covid-19 can take a long time to exhibit symptoms (if at all), along with the uncertainty on transmission rates and recovery rates, added to the exponential nature of growth in infection numbers means that daily updates are very difficult to interpret. Getting any assumption on these figures even slightly wrong can completely change how we interpret the results from our tests.
One infected person who isn’t tested today could turn into three, and then nine, all before the first person exhibits any symptoms. On the other hand, if everyone gets tested at the same time today, the snapshot we get of the situation could be outdated before the results are even available.
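The arithmetic behind this is simple compounding. The sketch below assumes, purely for illustration, that the number of cases triples in each period, as in the one, three, nine example; the real growth factor and the length of a period are exactly the uncertain quantities discussed above.

```python
def cases_after(periods, initial_cases=1, growth_per_period=3):
    """Cases after some periods if the count multiplies once per period."""
    return initial_cases * growth_per_period ** periods


# Hypothetical tripling per period.
for delay in range(5):
    print(f"{delay} period(s) without detection: {cases_after(delay)} case(s)")
```

The same calculation applies to delayed test results: a count that is a few periods old understates the current situation by the growth factor raised to the length of the delay.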
This is an issue that has been bothering me but doesn’t seem to be coming under much discussion – we’re getting numbers, but what do they mean? One hopes that the government has some structure in mind in terms of how they establish the system for testing that provides them with genuine insight into the state of the virus in the country rather than haphazardly generating numbers that are difficult to interpret and act upon.
Finally, a major problem with systems such as this is that they exist around tipping points. A millimetre in the wrong direction and the outcome of the system changes completely. So, for example, if one were to carry out a vaccination drive that requires 98% inoculation to be effective, getting 97.5% does not mean that the job is almost completely done. It means the opposite – 97.5% inoculation is closer to zero than to 98%. This is probably why governments around the world are more realistically talking about flattening the curve rather than obliterating it.
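One way to see the tipping point is through the textbook relation between the herd immunity threshold and the basic reproduction number in a simple random-mixing model, threshold = 1 - 1/R0. The R0 of 50 below is not a claim about any real disease; I chose it only because it makes the threshold equal to the 98% in the example.

```python
def herd_immunity_threshold(r0):
    """Share of the population that must be immune for cases to shrink."""
    return 1 - 1 / r0


def effective_r(r0, coverage):
    """Average new infections per case when a share `coverage` is immune."""
    return r0 * (1 - coverage)


R0 = 50  # hypothetical, chosen so that the threshold is 98%
print(f"threshold: {herd_immunity_threshold(R0):.1%}")
for coverage in (0.975, 0.98, 0.985):
    r = effective_r(R0, coverage)
    print(f"coverage {coverage:.1%}: each case infects {r:.2f} others on average, "
          f"generation 20 has about {r ** 20:.3f} cases per initial case")
```

Just below the threshold each case still produces more than one new case, so chains of infection keep growing. Just above it they die out. In this simple model, half a percentage point of coverage decides which side the system ends up on.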
However, the government response in such a case will require them to do some serious mathematics and data collection. It will also require them to achieve and exceed targets. One hopes they are up to the task!


