Wednesday 19 November 2014

Nate Silver Signal Noise

I don't know if it's The Blob, old age, a full teaching schedule or the misplacing of my glasses but I don't read books like I used to.  From about the age of 13, when I learned to read, I've been ploughing through 2 to 4 books a week.  This is nothing to Dau.I who knocks off a book-a-day and has done since she learned to read in the month after her sixth birthday.  But it's a habit that has been quite abruptly shaken loose over the last couple of years. During that time, I'd pick up a book and almost immediately fall into a slobbery doze; and when I woke a short while later I'd have to start that section again.  It was much easier to write 600 words than read them.

Nevertheless, in 2014 I have struggled through The Signal and the Noise: The Art and Science of Prediction by Nate Silver.  It has taken me about 10 months, which is a pathetic 2 pages a day.  Part of the problem is that the 450 pages of text are printed in a mean minuscule font that requires bright lights and extra levels of concentration to get into the run of it.  Despite this, I was never tempted to fire the book out of the window: I knew I was getting gold and it was worth sticking at the project. What Silver does is apply mathematics, experience and probability to a number of problems which hinge on telling the future.  How good are we at forecasting:
  • the weather (really rather good over the next 4 or 5 days)
  • determining when earthquakes will happen (really no better than guesswork)
  • who will win this Saturday's sports fixture (fair if you work hard at having a good model)
  • rise and fall in the stock market (hopeless unless you are trading inside)
He has the best explanation of how the currently trendy Bayesian statistics work in the real world, so I'm grateful for that alone.  Bayesian stats require you to do something that is, on the narrow face of it, a bit unscientific: you are required to have a punt at what you think the outcome of a test or experiment will be, rather than claiming, like Darwin, "I worked on true Baconian principles, and without any theory collected facts".  The results of the test are then used to modify your 'priors' so that they come closer to the truth. It's mathematically very simple and quite powerful as a predictive tool - IF you are good at guesstimating what the likelihoods of several related events are.  I think we can get much better at these estimates if we practise making them.
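For anyone who would rather see the priors being modified than take it on trust, here is a minimal sketch in Python. It is not taken from Silver's book: the function name `bayes_update` and all the numbers (a 5% initial punt, a test that comes up positive 90% of the time when the hunch is right and 10% of the time when it is wrong) are made up purely for illustration.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(hypothesis | evidence) using Bayes' theorem.

    prior               -- belief in the hypothesis before the test (0..1)
    p_evidence_if_true  -- chance of seeing this result if the hypothesis holds
    p_evidence_if_false -- chance of seeing this result if it does not
    """
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1.0 - prior) * p_evidence_if_false
    return numerator / denominator


# Hypothetical starting punt: only a 5% belief that the hunch is right.
belief = 0.05

# Three positive test results in a row; yesterday's posterior is today's prior.
for test_number in range(1, 4):
    belief = bayes_update(belief, p_evidence_if_true=0.90, p_evidence_if_false=0.10)
    print(f"after test {test_number}: belief = {belief:.3f}")
```

The point of the loop is that a poor initial guess is not fatal: each result drags the estimate towards the truth, provided the two likelihoods fed in are themselves sensible.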

Typical of the book is an analysis of just how unexpected the tragedy of 9/11 was and the answer is: well, actually it could/should have been expected. In 2002 Donald Rumsfeld was pilloried for his WTF? oracular statement "there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns -- the ones we don't know we don't know."  But the idea of unknown unknowns is key to making sense of a complex and potentially violent world.  It's the core, and indeed the title, of Nassim Taleb's 2007 book The Black Swan. A lot of people, who should have done better to protect US citizens, stoutly maintained that the 9/11 attacks were unknown-unknowns - so far beyond imaginings in scope that they couldn't (nobody could) have predicted such a Series of Unfortunate Events. Nate Silver shows in two graphs that this is a wretched excuse for dereliction of duty.  That's why the American tax-payer pays the big bucks to the TLAs (three-letter acronyms) NSA, CIA, FBI: so that they have some big ideas and deliver big solutions.  Here's the first one which indicates that, in 30 years, we have had one several-thousand-killer event in the catalogue of terrorist attacks against NATO countries.  It's way out there on the bottom right:
Xeroxed straight out of S&N, so a little skew.  On the following page, Silver plots "# of attacks killing this many people" vs "# of fatalities" just as above except on a log scale for both axes. Extrapolating the now straightish line down and to the right, it shows that an attack taking out 2500 people, far from being unimaginable, was to be expected about once every 80 years. Or a lot more likely than a Richter 8.2 earthquake in California.
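To put "once every 80 years" next to the 30-year window of the first graph, here is a back-of-envelope sketch. It leans on an assumption that is mine rather than Silver's: that each year carries an independent 1-in-80 chance of such an attack. The function name is likewise invented for the illustration.

```python
def chance_of_at_least_one(return_period_years, window_years):
    """Probability of one or more events in the window, assuming each year
    independently has a 1/return_period chance of producing the event."""
    p_per_year = 1.0 / return_period_years
    p_none = (1.0 - p_per_year) ** window_years
    return 1.0 - p_none


# Figures from the text: a once-in-80-years event, watched for over 30 years.
p = chance_of_at_least_one(return_period_years=80, window_years=30)
print(f"chance of at least one such attack in 30 years: {p:.0%}")
```

Under that simplifying assumption the answer is a long way from negligible, which is the point of the chart.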

The most readable chapter tells how, in his 20s, Silver dropped out of a well-paid job as an economic consultant for KPMG and made a modest fortune at poker. He got good at it because he worked really hard for several years getting to know when to fold and when to hold. This skill hinges in part on knowing the basic probabilities but the money lies in getting to predict what other players will do given a particular hand.  I've picked up on the language to this extent: when I'm in class getting the students to think about what the outcome of an experiment will be, I'll say something like "I lay €5 on the test-tube turning pink".  If I was unscrupulous and needed money, I'd actually slap folding money down on the bench-top, because my knowledge of biology (the priors), having been in the game for 4 decades, is much more extensive than that of students who haven't yet served four years.

I won't drop any more spoilers but urge you to lay hints among your loved ones to get you S&N for Christmas . . . and also suggest that a magnifying glass would be handy too. While waiting, you might check out Silver's blog FiveThirtyEight which is heavy on politics [538 is the number of seats in the US Electoral College] but has thoughtful essays on scientific and other issues.
