Monday, 25 March 2013

When 200 = 100,000

Last night I was watching The Challenger, a BBC docudrama starring William Hurt as Richard Feynman on the Presidential Commission that investigated the first Space Shuttle disaster in 1986. That's the one that blew up just over a minute after launch because a huge rubber washer, an O-ring, failed to keep hot gas where it should be. Feynman, according to the story/legend/record, is the only one who really wants to find out what happened - so that it won't ever happen again but also for the pleasure of finding out. As a scientist, he talks almost the same language as the engineers who designed the Shuttle, and as a transparently honest person, he has little time for the NASA spokesmen who seem bent (sic) on laundering the story so that they and their institution come out squeaky clean. There are two interesting aspects of the story as portrayed by the Beeb.

Firstly, Feynman finds other faults in the technical design of the spaceship apart from the famous O-rings. He discovers metal fatigue in the blades of the turbines, and there is a nod to the future (specifically the next Shuttle disaster) when Feynman runs his hand over the heat-shield tiles which failed to protect Columbia in 2003. So some of the complexity of the design, the engineering, and finding the cause of the disaster is addressed. If not the O-rings, something else could have failed.

Secondly, the drama focuses on the disconnect between what the engineers at NASA and its sub-contractors were saying about safety and reliability and what the NASA management was hearing them say about safety and reliability. The engineers conclude that the Shuttle is going to work as specified 99.5% of the time, which being interpreted means that it's going to fail about 1 time in 200. The management OTOH maintain during the public raree-show for the media that their estimate of failure is about 1:10^5, which being interpreted means that it's going to fail about 1 time in 100,000. To most of us 99.5% reliable sounds pretty darned reliable and 99.999% reliable sounds only a little bit better, yet the two estimates differ by a factor of 500.
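For anyone who wants to check the conversion between "percent reliable" and "1 time in N", here is a minimal Python sketch. The function names are my own invention for illustration, and the only inputs are the 1-in-200 and 1-in-100,000 figures quoted above.

```python
def one_in_n_from_reliability(reliability_percent: float) -> float:
    """Turn a reliability percentage into N, where failure is '1 time in N'."""
    failure_fraction = (100.0 - reliability_percent) / 100.0
    return 1.0 / failure_fraction


def reliability_from_one_in_n(one_in_n: float) -> float:
    """Turn a '1 time in N' failure rate into a reliability percentage."""
    return 100.0 * (1.0 - 1.0 / one_in_n)


if __name__ == "__main__":
    engineers = 200        # engineers' estimate: 1 failure in 200 flights
    management = 100_000   # management's estimate: 1 failure in 100,000 flights

    print(f"1 in {engineers}: {reliability_from_one_in_n(engineers):.3f}% reliable")
    print(f"1 in {management}: {reliability_from_one_in_n(management):.3f}% reliable")
    # How many times riskier the engineers' figure is than management's
    print(f"ratio of failure rates: {management / engineers:.0f}")
```

The two percentages look close on paper; it is the ratio of the failure rates that shows how far apart the two camps really were.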

Feynman was charismatically famous, as only the very best scientists are, for being able to explain complex issues to ordinary folks. His rhetorical question to the management graphically exposes the absurdity of their position: "So you're saying that NASA can launch a Shuttle every day for 300 years and not expect it to go wrong?"
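Feynman's "every day for 300 years" is easy to check on the back of an envelope, or in a few lines of Python. This is only a sketch: it assumes one launch per day, a 365.25-day year, and launches that fail independently with a fixed probability, none of which comes from the film. The function names are mine.

```python
DAYS_PER_YEAR = 365.25  # average year length, leap days included (an assumption)


def years_of_daily_launches(launches: int) -> float:
    """How many years it takes to make this many launches at one per day."""
    return launches / DAYS_PER_YEAR


def expected_failures(years: float, one_in_n: float) -> float:
    """Average number of failures over `years` of daily launches."""
    return years * DAYS_PER_YEAR / one_in_n


def prob_no_failure(years: float, one_in_n: float) -> float:
    """Chance of a spotless record, assuming independent launches."""
    launches = round(years * DAYS_PER_YEAR)
    return (1.0 - 1.0 / one_in_n) ** launches


if __name__ == "__main__":
    print(f"100,000 daily launches take {years_of_daily_launches(100_000):.0f} years")
    for label, one_in_n in (("management", 100_000), ("engineers", 200)):
        print(f"{label}: {expected_failures(300, one_in_n):.1f} expected failures in 300 years")
```

On the management figure, 100,000 daily launches take about 274 years, so roughly one loss in Feynman's 300 years; on the engineers' figure the same schedule loses a Shuttle every six or seven months.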

Unfortunately for the astronauts and their families, even the famously pragmatic engineers were optimistic about the safety and reliability of their baby and its 2.5 million components. The Shuttle lumbered along for 30 years from 1981 to 2011, had 135 launches and fucked up twice. Failure rate about 1 in 70, reliability 98.5%. Take the bus?
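To see how the actual record stacks up against the two estimates, here is a rough Python sketch. It treats the 135 launches as independent trials with a constant chance of failure (a binomial model, which is my simplification, not anything NASA or the Commission used), and the function names are made up for illustration.

```python
from math import comb

LAUNCHES = 135  # total Shuttle launches, 1981 to 2011
LOSSES = 2      # Challenger and Columbia


def prob_at_least(k: int, n: int, p: float) -> float:
    """Binomial chance of k or more failures in n independent launches."""
    prob_fewer = sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k))
    return 1.0 - prob_fewer


if __name__ == "__main__":
    observed_rate = LOSSES / LAUNCHES
    print(f"observed: 1 in {1 / observed_rate:.1f}, "
          f"{100 * (1 - observed_rate):.2f}% reliable")

    for label, one_in_n in (("engineers", 200), ("management", 100_000)):
        p = 1.0 / one_in_n
        print(f"{label}: {LAUNCHES * p:.5f} expected losses, "
              f"P(2 or more) = {prob_at_least(LOSSES, LAUNCHES, p):.2e}")
```

Under the engineers' 1 in 200, two or more losses in 135 flights is unlucky but unremarkable (somewhere around one chance in seven); under the management's 1 in 100,000 it is of the order of one in a million.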
