Tuesday, May 05, 2020

Experts Worry: Predictive News Headlines in the Age of COVID-19

Cataclysms, personal or shared, have a way of distorting your perception of time.

March 12th, 2020 was the day I felt the Coronavirus crisis escalate. That day, I felt as though I was living simultaneously in three distinct realities. First and most vividly, there was the physical world around me. It was one of those blissful early spring days - bright and breezy. Some friends were meeting up at the local watering hole for a happy hour that was tinged with a different energy than previous ones. I think we knew it might be the last time we'd see each other in person for a while, but this didn't make us glum. Instead, there was a kind of enhanced camaraderie, laughing at the craziness of it all, because what else was there to do?

Then, existing in what seemed like another universe, there was the reality that existed inside my computer and phone, on news websites and social media: a world quickly falling apart. There was an ever-escalating series of fear-evoking stories, every one of them true.

And then there was the thing itself: the unknowable reality of the threat posed by the virus. Though the virus itself is knowable insofar as we are able to know viruses and what they do to various types of human bodies, the matter of immediate concern - the precise way the virus would spread through a given population and affect each individual person - could not be known. That reality is contingent upon too many things to be knowable, at least in mid-March, and perhaps now in early May and for the foreseeable future. The threat of the virus depends on future government policies at federal, state, and local levels, future workplace policies, the speed with which treatments will be developed and manufactured, and the future behavior of billions of individuals, as each individual's decision to, say, stay home and watch Netflix or say 'fuck it' and go out to a bar (when bars were/will be open) affects all downstream outcomes. Scientists of various stripes can imperfectly predict the spread of the virus and its deadliness by looking at models based on prior behavior, but they cannot as yet know with absolute certainty (or really much certainty at all, it would seem) who the virus will infect in a particular social context, when it will infect them, how long it will stick around in a population, or who or how many it will kill.

As I moved into the gently mandated quarantine stage of the event, I started to think more about the disjuncture among these three worlds. A couple of months earlier, I'd had time to think about the ways in which the social internet (news, commentary about current events on social media) presented its users with a distorted view of the world. The long and short of it is this: it amplifies threats.

In the case of the virus, the threats are multiple. There is the virus itself, and then there is the effect of virus containment on the economy. Also, as always in the U.S. and perhaps elsewhere, there is the threat of the other political tribe: Would Trump invoke martial law? Would protesters spread the virus? Would liberals exaggerate the threat to make Trump look bad?

The key word in all of these questions is 'would.' I came to realize that many of the headlines I read contained words like 'would,' 'could,' 'may.' Some of the bolder ones contained the word 'will.'

They were about bad things that had not happened yet. 

On a podcast from mid-March, Malcolm Gladwell recounted something he had read that day quoting experts from the University of California, San Francisco, a leading medical school. The experts predicted that there would be more than 1 million American deaths from the virus.

From March 17, 2020, in the New York Times: 'There may be two to four more rounds of social distancing.'

And later, from CNN.com on April 14th, 2020: 'US may have to keep social distancing until 2022, scientists predict.'

From May 4th in the Washington Post: 'Draft report predicts covid-19 cases will reach 200,000 a day by June 1.' The article referred to a leaked report from the CDC that also predicted there would be 3,000 deaths per day in the second half of May.

These headlines were accompanied by opinion-piece headlines that would have seemed more at home on less prestigious news websites a few months earlier. From the New York Times: 'One simple idea explains why the economy is in great danger'; 'Stop saying everything is under control. It isn't.'; 'More severe than the Great Recession.'

Most of the headlines and stories quoted experts who offered informed predictions about the virus or the economy. Right away, I thought of Phil Tetlock's work on expert political judgment. Tetlock found that when political experts of all ideological stripes were forced to make falsifiable predictions about a range of outcomes, they were not much better than chance or non-experts. The more famous the experts were, the worse their prediction records were. In his book The Signal and the Noise, Nate Silver reviews the incentives experts have to make predictions, why they are rewarded for more outrageous predictions with more coverage and fame, and why they are not punished for being wrong.
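To make 'falsifiable prediction' concrete: forecasting projects like Tetlock's score each forecaster by comparing the probability they assigned to an outcome against whether it actually happened, commonly with a Brier score. Here is a minimal sketch of that scoring; the forecasts and outcomes below are invented for illustration, not drawn from any of the studies.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between predicted probabilities and what happened.

    0.0 is a perfect record; always guessing 50/50 scores 0.25; a confident
    forecaster who is frequently wrong can score worse than coin-flipping.
    """
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# Invented example: 1 means the predicted event occurred, 0 means it didn't.
outcomes    = [1, 0, 0, 1, 0]
bold_pundit = [0.95, 0.90, 0.10, 0.20, 0.80]  # dramatic, quotable calls
hedger      = [0.60, 0.60, 0.40, 0.60, 0.40]  # timid, unquotable calls

print(brier_score(bold_pundit, outcomes))  # ~0.42, worse than guessing 50/50
print(brier_score(hedger, outcomes))       # ~0.20, boring but better
```

The incentive problem Silver describes lives outside this score: the bold pundit gets the coverage and fame either way, while the scoring - the part that would punish being wrong - is rarely done in public.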

There are various lessons you might take away from Tetlock's ongoing project to assess the ability of experts to forecast a range of outcomes. The one I keep coming back to is that forecasting any outcome that involves a large number of people's behavior is really difficult. The more people who may influence the outcome (that is, the more people whose individual behavior is part of the system you're trying to observe and extrapolate from), the harder the outcome is to predict.

Some of the predictions about the virus and the economy that dominated headlines were not falsifiable, as they did not provide a time range and thus could eventually be proven true even if they were not true on a particular date (they would never be false, but simply not true yet). But many were falsifiable: they made specific predictions about the duration or magnitude of an economic recession or depression; they made specific predictions about the number of infections or deaths resulting from the virus (the falsifiability of which assumes you accept data collected by authorities).

Sometimes, the experts would try to convey their level of uncertainty in their predictions. Sometimes, they would not. Most of the time, this uncertainty was not conveyed in the news article; it was almost never conveyed in the headline.

Aside from the inherent unpredictability of large groups of people, there is another reason why many forecasts relating to the virus or the economy will turn out to be wrong. Predictions like these can function as warnings that people heed, and the resulting action prevents the predicted outcome from occurring. Nate Silver calls this a 'self-cancelling prediction' or 'self-cancelling prophecy.' Similar problems plague predictions about environmental catastrophe: the more dire the predictions, the more likely they are to spur innovation or regulation that prevents them from coming true.

I wonder if some folks are engaging in a kind of deliberately misleading, exaggerated framing of virus threats. Perhaps journalists and those who post on social media are aware of the shortcomings of the data they are working with, aware that they are focusing on the most dire scenarios and ignoring others. They do this because they believe, rightly, that the more dire the predictions, the more likely they are to spur action that prevents them from coming true.

I think that people often derive the wrong lesson from self-cancelling predictions. An averted outcome does not prove the prediction was correct: it is possible that the prediction would have been wrong even if no action had been taken. In and of itself, the non-occurrence of the predicted outcome provides little evidence about the accuracy of the prediction. I think the better lesson to draw is that in order to have faith in predictions whose outcomes can plausibly be affected by the people who learn of them, we must understand the mechanisms by which the predicted outcome will or won't occur. We must be able to account for the effects of particular behaviors in isolation (e.g., the effect of social distancing on viral transmission; the effect of carbon dioxide on sea levels) in order to really understand and predict complex phenomena.

The recent spate of predictive headlines brings to mind another domain examined in Nate Silver's book: weather predictions. Meteorologists' predictions of the weather on any given day were often wrong; no surprise there, as weather is another complex, hard-to-predict system. What's interesting is that the errors were systematic: meteorologists tended to predict rain on days that turned out to be sunny more often than they predicted sun on days that turned out to be rainy. As a reason for this, Silver noted that meteorologists were 'punished' for one type of wrong answer more severely than they were for the other. Most people saw sun on a supposedly rainy day as a pleasant surprise, while they tended to get angry at meteorologists who failed to warn them about the negative outcome: rain on a supposedly sunny day.
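The incentive Silver describes is easy to see in a toy model: if the audience punishes a missed rain warning five times as harshly as a false alarm, then even when rain is unlikely, announcing rain minimizes the forecaster's expected punishment. This is my own illustrative sketch - the penalty values and the 30% rain probability are made up, not taken from Silver's data.

```python
# Toy model of the 'wet bias' incentive described above. All numbers are invented.
P_RAIN = 0.30                # assumed true chance of rain on a given day
MISSED_RAIN_PENALTY = 10.0   # audience anger at getting soaked without warning
FALSE_ALARM_PENALTY = 2.0    # mild annoyance at carrying an unneeded umbrella

def expected_penalty(announce_rain: bool) -> float:
    """Expected audience punishment for a categorical rain/sun forecast."""
    if announce_rain:
        # The forecast is wrong only if it stays dry.
        return (1 - P_RAIN) * FALSE_ALARM_PENALTY
    # The forecast is wrong only if it rains.
    return P_RAIN * MISSED_RAIN_PENALTY

print(expected_penalty(announce_rain=True))   # 0.7 * 2  = 1.4
print(expected_penalty(announce_rain=False))  # 0.3 * 10 = 3.0
```

Sun is more than twice as likely as rain, yet the punishment-minimizing forecast is rain: the forecaster ends up wrong more often, but wrong in the direction the audience forgives.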

Many of the predictions dominating news headlines will inevitably turn out to be wrong, but will they be systematically wrong, wrong in a particular direction? I suspect that most journalists and news consumers see virus infections, deaths, and economic hardship in much the same way people see rain: they would rather the predictions turn out to have been too dire than not dire enough.

However, this creates a problem. If the predictions about virus infections and deaths are unnecessarily dire, this will cause consumers to spend less and investors to invest less, leading to worse economic outcomes. If the predictions about the economy are too dire, policy makers, business owners, voters, and consumers will push for re-opening too soon, resulting in worse health outcomes. All unnecessarily dire predictions will likely harm people's mental and emotional health, and any wrong prediction will harm subsequent trust in news sources.

I suspect that predictions in most mainstream news outlets will overestimate negative outcomes associated with the virus and underestimate negative outcomes associated with the economy. Of course, there's a political element to the predictions (those on the Left tend to be more concerned with the virus while those on the Right tend to be more concerned with the economy), but beyond that, I think the negative outcomes associated with the virus (mass death; dying alone) are more vivid, more viscerally repellent than those associated with the economy (lagged rises in social unrest, substance abuse, domestic abuse, and violent crime that typically accompany prolonged mass unemployment), which tend to be more diffuse and less easily depicted.

What to do about all this? Well, before going any further, it seems worthwhile to test all of my assumptions. In the spirit of putting my money where my mouth is, here are a few falsifiable hypotheses:

  • The number of 'predictive headlines' (i.e., headlines that relay information about an event or state of the world that has yet to occur at the time of publication) has increased since the middle of March 2020. (A rough sketch of how such headlines might be flagged follows this list.) 
  • Of the predictive headlines that are falsifiable at present, more headlines will have overestimated threats than will have correctly estimated or underestimated threats. 
  • The more vivid the threat, the greater the magnitude of the error in prediction. 
  • News consumers exposed to more dire predictions will be more likely to take action (or intend to take action) than those exposed to less dire predictions. 
  • The inclusion of information about the confidence levels of experts (e.g., swapping out the word 'will' for the word 'could' or 'might') will have no effect on news consumers' behaviors or intentions. 
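For the first hypothesis, here is a rough sketch of what flagging 'predictive headlines' might look like. The modal-word list and the third sample headline are my own illustrative choices, not a validated coding scheme, and a crude word match like this has obvious false positives (the month 'May,' for instance).

```python
import re

# Crude operationalization of 'predictive headline': a headline hedged with
# modal or forecasting language about something that hasn't happened yet.
PREDICTIVE_WORDS = re.compile(
    r"\b(will|would|could|may|might|predicts?|expected to)\b", re.IGNORECASE
)

def is_predictive(headline: str) -> bool:
    """Return True if the headline contains modal or forecasting language."""
    return bool(PREDICTIVE_WORDS.search(headline))

sample = [
    "US may have to keep social distancing until 2022, scientists predict",
    "Draft report predicts covid-19 cases will reach 200,000 a day by June 1",
    "Governor announces statewide stay-at-home order",  # invented non-predictive example
]

flagged = [h for h in sample if is_predictive(h)]
print(f"{len(flagged)} of {len(sample)} headlines flagged as predictive")  # 2 of 3
```

Actually testing the hypothesis would, of course, require a real sample of headlines from before and after mid-March, not three cherry-picked examples.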
To motivate journalists and those who post on social media not to post speculative 'news,' perhaps we could shame the behavior with a catchy, albeit misleadingly reductive moniker: 'eventually fake news,' or something like that.

I can understand the desire to compulsively speculate at a time like this. Typically, there's a certain amount of uncertainty in the world. You might not know some of the details about what will happen over the next year, but you often have a rough idea of what it will be like, what you'll do, where you'll be. At a time when so much is uncertain, maybe we can't help making predictions, even if we know most of them will turn out to be wrong. But even in times of great uncertainty, there must be something we can learn from our wrongness. Right?