A (Not So) Quick Word About Recall Bias

I was reading through the reviews of a restaurant the other day when I came across comments from several people who swore that they had been made sick by food from that restaurant. One commenter stated that they had become “gravely ill” soon after leaving the restaurant. Another commenter agreed, saying that they had become ill “about a half hour” after eating at the same establishment. Soon after that, others piled on. As I watched the ratings site, I was upset to see the thread turn into a comedy of stupidity within hours.

Judging by the comments, the incubation time for their disease was between 30 minutes and TWO WEEKS. Not only that, but their onsets were days and weeks apart from each other. This leads me to one of two possible conclusions: 1) The restaurant has a hygiene problem so enormous that it has been making people sick over a span of weeks. Or 2) the commenters were exhibiting – at the very least – recall bias and/or – at the very worst – a mob mentality.

Then again, they could all have been the same person with some sort of vendetta. (I’m not linking or publishing the exact quotes because the restaurant already has enough issues.)

It is very natural for us to associate our illness with the very last thing we ate before we got sick, especially if we are not familiar with things like “incubation times” or the ways in which the viruses and bacteria that we eat can make us sick. For example, it takes just a few Norovirus particles to make a person sick. The incubation time – the time from infection to symptoms – ranges from 24 to 48 hours with Norovirus, certainly not 30 minutes. That is, you’re completely symptom free for about a day before you get really sick from Norovirus.

Salmonella and E. coli make you sick through the cunning use of toxins. Alright, alright… They don’t do it on purpose. It’s just that some of their metabolic byproducts, or parts of their own cell membranes, may act as toxins once in our gut. Their incubation times? 12 to 72 hours for Salmonella and 3 to 4 days for E. coli. Again, nowhere near the 30-minute mark. And certainly not two weeks later.

What could cause disease in 30 minutes or less or your money back?

Staphylococcus aureus or Bacillus cereus can make you sick within 30 minutes of ingesting their toxins… But it’s a stretch in this case, especially in light of others reporting such disparate incubation times.
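To make that reasoning concrete, here is a rough sketch in Python that checks reported onset delays against the incubation windows mentioned above. The windows for Norovirus, Salmonella, and E. coli are the ones quoted in this post. The 8-hour upper bound for the toxin-mediated illnesses is my own assumption for illustration, since I only gave the 30-minute figure. The function names are made up for this example.

```python
# Incubation windows in hours. The first three come from the text above;
# the upper bound for the toxin-mediated illnesses is an assumption.
INCUBATION_HOURS = {
    "Norovirus": (24, 48),
    "Salmonella": (12, 72),
    "E. coli": (72, 96),
    "S. aureus / B. cereus toxin": (0.5, 8),  # upper bound assumed
}


def compatible_agents(onset_hours):
    """Return the agents whose incubation window contains this delay."""
    return [
        name
        for name, (low, high) in INCUBATION_HOURS.items()
        if low <= onset_hours <= high
    ]


def agents_explaining_all(onsets):
    """Return the agents whose window contains every reported delay."""
    return [
        name
        for name, (low, high) in INCUBATION_HOURS.items()
        if all(low <= hours <= high for hours in onsets)
    ]


# The two extremes from the comments: "about a half hour" and two weeks.
reported = [0.5, 14 * 24]
for hours in reported:
    print(hours, compatible_agents(hours))
print("explains every report:", agents_explaining_all(reported))
```

Under these assumptions, only the toxin-mediated illnesses fit the 30-minute report, nothing on the list fits the two-week report, and no single agent covers both. That is the mismatch that should make you suspicious of the reviews.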

This is why health departments and health care providers need to educate the public on the nature and behavior of gastrointestinal disease – and other diseases as well. That lack of understanding can not only lead to a restaurant or other food business being wrongly accused of making people sick – which can drive it under financially – it can also muddy investigations of serious foodborne outbreaks.

How Many Was That Again?

Have you ever noticed that reports of case counts from public health sources usually have the word “reported” included in them? You have, haven’t you? Well, have you ever wondered why that is so?


The reason is the inherent nature of epidemiological surveillance and the barriers to getting an exact case count for every single disease or condition out there. Some of these issues with surveillance lead to an overestimation of the number of cases. Other issues lead to an underestimation. Either way, it is highly unlikely that you are seeing the true number of cases in any report from public health.

Does that make these reports not useful or even – as some will claim – “manipulated” in any way? Not necessarily, and let me tell you why…

The first thing you need to understand in analyzing descriptive data presented to you from public health sources is the case definition being used in counting cases. A case definition is usually presented in terms of person, place, and time. For example, a case of Salmonella food poisoning may be defined as “anyone with a stool culture positive for Salmonella who ate avocados in Pittsburgh in the week of December 8 to 15”. That’s pretty specific, right?

Case definitions can also be very broad, like saying that a case of Salmonella food poisoning is “anyone with gastrointestinal disease with an onset of December 10 to 17”. This definition would surely bring up many more cases than the cases from the previous, more stringent case definition. So you can see why you need to know exactly what defines a case.
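Here is a small Python sketch of how the two definitions above count the same reports differently. Everything in it is hypothetical: the four reports are invented purely for illustration, the year is arbitrary because the definitions don't give one, and the field and function names are mine.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

YEAR = 2012  # arbitrary; the definitions above give no year


@dataclass
class Report:
    label: str
    gi_illness: bool
    onset: date
    stool_culture: Optional[str] = None    # organism grown, if a culture was done
    ate_avocados_on: Optional[date] = None
    ate_in: Optional[str] = None


def is_narrow_case(r):
    """Culture-confirmed Salmonella, ate avocados in Pittsburgh, Dec 8 to 15."""
    return (
        r.stool_culture == "Salmonella"
        and r.ate_in == "Pittsburgh"
        and r.ate_avocados_on is not None
        and date(YEAR, 12, 8) <= r.ate_avocados_on <= date(YEAR, 12, 15)
    )


def is_broad_case(r):
    """Any gastrointestinal illness with an onset of Dec 10 to 17."""
    return r.gi_illness and date(YEAR, 12, 10) <= r.onset <= date(YEAR, 12, 17)


# Invented reports, for illustration only.
reports = [
    Report("A", True, date(YEAR, 12, 11), "Salmonella", date(YEAR, 12, 9), "Pittsburgh"),
    Report("B", True, date(YEAR, 12, 12)),
    Report("C", True, date(YEAR, 12, 16), None, date(YEAR, 12, 10), "Pittsburgh"),
    Report("D", True, date(YEAR, 12, 5)),
]

narrow = [r.label for r in reports if is_narrow_case(r)]
broad = [r.label for r in reports if is_broad_case(r)]
print("narrow definition:", narrow)
print("broad definition:", broad)
```

The same four reports produce one case under the narrow definition and three under the broad one. Nobody got sicker between the two counts. Only the definition changed.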

Likewise, you need to know what diagnostic tools are being used to define a case. In our example above, we used a stool culture for the specific case definition and a clinical description of “gastrointestinal disease” for the broad one. When being presented with data, make sure that you know which diagnostic tool or tools were used. It makes a big difference.

For example, in the late 1970s and early 1980s, we had very little technology with which to isolate the Human Immunodeficiency Virus (HIV). So an HIV infection had to progress to Acquired Immune Deficiency Syndrome (AIDS) – a collection of signs and symptoms of the deterioration of the immune system – in order to be counted as a case of HIV infection. The definition of AIDS itself was very broad at first and was later refined. As more and more diagnostic tools have become available, the case definitions of HIV and AIDS have changed. Where the presence of an opportunistic infection was once enough to diagnose a person with AIDS, there are now lab tests that look at white blood cell counts and allow an earlier diagnosis, so that intervention and treatment can start earlier.

The example with HIV/AIDS above is true of autism as well. It used to be that there was no uniform diagnosis for autism – or any of the conditions that fall within the autism spectrum. Children were either “hyper”, or “retarded”, or “slow”, or had some other condition. As medical science began to understand what it meant to be on the autism spectrum, the definition of someone with autism changed, leading to better recognition of cases and a subsequent rise in the prevalence – the underlying rate of disease in a population – that we see now.

Incidentally, the case definition for autism became more sensitive and specific – and thus more accurate – around the same time that vaccines began to be more abundant and more widely recommended. This led to the misperception that vaccines, and not the better diagnostic tools, raised the rates of autism. But that is for a whole other discussion.

It goes without saying that an improvement in surveillance methods also leads to a change in the number of cases observed and counted. For example, infant mortality reporting has gotten better as more and more health care providers in the United States are able to report infant deaths electronically. Health departments at all levels of government are more active in their surveillance of cases by surveying hospitals, clinics, and even midwives on the survival numbers of infants. So you can see how this extra effort to count the deaths that were previously not reported has led to the belief that the infant mortality rate in the country has increased.

Other countries don’t have the same systems as we do in the United States. As a result, their infant mortality rates are different – even lower – than those observed here. Is it true, then, that the US is failing in controlling infant mortality compared to countries with fewer resources? Nope. It’s all in how we’ve been counting the numbers. Apples to apples, the rates are much better in the United States, where expectant mothers have better access to prenatal care and children are – for the most part – born in medical facilities capable of caring for them if they are in trouble.

So here is what you do when you compare two rates of a single disease either across time, across location, or even across populations of people. You need to make sure that the case definitions of both datasets are comparable and as close to matching as possible. Otherwise, you really are comparing apples to oranges. You also need to look at the diagnostic methods used for each dataset. There is no use in comparing one dataset whose cases were diagnosed based on symptoms – a subjective way of diagnosing – and another dataset whose cases were diagnosed by a lab – an objective way of diagnosing. Finally, you need to look at the surveillance system that collected these data and make sure that the systems for both sets of data are – yet again – comparable. If one relied on providers reporting cases while the other went out and looked for cases, then – yet again – you will find yourself comparing apples to oranges.
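That checklist is simple enough to write down as a Python sketch. The `Dataset` fields and the labels "passive" (providers report cases) and "active" (someone goes out and looks for cases) are my own shorthand, and the two example datasets are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    case_definition: str
    diagnostic_method: str  # e.g. "symptoms" or "lab"
    surveillance: str       # e.g. "passive" or "active"


def comparability_issues(a, b):
    """List the ways in which two datasets are not apples to apples."""
    issues = []
    if a.case_definition != b.case_definition:
        issues.append("case definition")
    if a.diagnostic_method != b.diagnostic_method:
        issues.append("diagnostic method")
    if a.surveillance != b.surveillance:
        issues.append("surveillance system")
    return issues


# Hypothetical datasets, for illustration only.
county_a = Dataset("County A", "GI illness, onset Dec 10 to 17", "symptoms", "passive")
county_b = Dataset("County B", "GI illness, onset Dec 10 to 17", "lab", "active")

print(comparability_issues(county_a, county_b))
```

An empty list does not prove the two rates are comparable, but a non-empty one tells you exactly where the oranges snuck in.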