The teleconference before the Challenger disaster – how the thinking shifted (Part Two)
This is part two of a post begun last week on the Challenger disaster. This post looks at the teleconference the night before the Challenger shuttle launch in 1986 and why the decision was made to launch even though the engineers were nervous. But to understand this post you need to understand about O-Rings on solid fuel rocket boosters and why they don’t “seat” well in the cold. I covered all this in last week’s post. Have a read of that post first.
The key problem – the “framing” of the conversation changed
My contention is that the “framing” of the conversation changed subtly in this teleconference the night before launch. That is, the implicit question the participants were trying to answer got reframed as the teleconference proceeded. As a result of that reframing, the conversation produced the wrong outcome.
What do I mean by the “frame” of the conversation? As a flight readiness conversation, the question it should have been trying to answer was “Prove to me that we should launch”. However, by the end of the teleconference it is very apparent that the question people are trying to answer is “Prove to me that we shouldn’t launch”.
How the teleconference unfolded
Background
The teleconference was called for 8:45pm on January 27th 1986 between the Marshall Space Flight Centre in Alabama (part of NASA) and the contractor in Utah, Morton Thiokol, which had built the solid fuel rocket boosters. The shuttle was due to be launched the next morning, after having had its launch delayed by 24 hours because of strong winds. The teleconference had been called because Thiokol was concerned about the weather forecast showing temperatures below freezing for the shuttle launch the next day.
Remember, the coldest launch for the shuttle to date had been in January the year before at 53 degrees Fahrenheit. That launch had shown the most O-Ring damage so far. The prediction was now for a launch tomorrow with the temperature in the 20s.
8:45pm to 10:30pm – the first part of the teleconference
“away from goodness in the current database”
The teleconference opened with the Thiokol staff engineer Roger Boisjoly presenting his argument about the O-Rings. His basic argument was that the cold was likely to prolong the time before the primary O-Ring would seat. He claimed that this is what accounted for the damage seen in the previous cold launch the year before (when it had been 53 degrees). (If the primary O-Ring didn’t seat in time the secondary O-Ring would be left to do the job, and there was other evidence that the secondary O-Ring was not working effectively beyond about 600ms after launch.)
(By the way, the main reference I have for this teleconference is Claus Jensen’s book “Contest for the heavens. The road to the Challenger disaster”, Harvill, 1996. All references are to that book unless otherwise indicated.)
Boisjoly’s argument is simple. The colder the O-Ring, the harder and less responsive it is, and the more viscous the grease around it. So it takes longer to seat – allowing gases to burn past it and erode it. He told the Rogers Commission about this temperature effect: “it was easier to squeeze a sponge into a crack than it was to force a brick” (p304). Feynman, of course, went on to show the same thing (recall last week’s post).
It’s at this point that we see the framing start to shift. Boisjoly is asked to quantify his concern. Could he give a probability of failure? And of course he can’t – of the 25 launches, this is the coldest launch by far. All he can do is say that: “it was away from goodness in the current data base” (p304).
The teleconference proceeds, and it’s at this point that one of the more interesting characters gets his part. Robert K Lund, the VP of Engineering at Thiokol, the guy whose job it is to run the engineers and to engage with management, gives his opinion. He says Challenger should not be launched in temperatures below 53 degrees (this being the temperature of the next coldest launch in January the year before). His logic seems to be: we suspect cold has something to do with it, we got away with it at 53 degrees, let’s not risk anything colder.
And again we see the framing shift further. George Hardy, a senior guy at Marshall (NASA), gets agitated and says he is “appalled” (p305) by this recommendation from Lund. Another Marshall guy, Lawrence Mulloy, wants to know if Thiokol is saying that NASA should delay all launches until “next April” (p306).
The discussions continue. It seems that the engineers from Thiokol will stick to their guns. But, at 10:30pm a senior manager at Thiokol (not an engineer) calls for a halt so that they can caucus briefly on their own.
10:30pm to 11:00pm – the Thiokol caucus
“Take off your engineering hat and put on your management hat”
Jerry Mason, a Senior VP at Thiokol and the most senior manager present, starts the Thiokol caucus by observing that the decision from here on will be “a management decision” (p 306) and by asking “Am I the only one who wants to fly?” (p 307).
It’s at this point we see a further shift, and things get really tricky for Thiokol’s senior engineer Robert Lund. Mason (his ultimate boss) turns to Lund and asks him to “take off his engineering hat and put on his management hat” (p307). And Lund changes his mind. The questioning of Lund at the Rogers Commission is worth quoting extensively because it is so illuminating and so painful (p307-308) (emphasis mine):
ROGERS: How do you explain the fact that you changed your mind when you changed your hat?
LUND: … We had to prove to them that we weren’t ready. And so we got ourselves in the thought process that we were trying to find some way to prove to them it [the launch] wouldn’t work…
ROGERS: In other words, you honestly believed that you had a duty to prove that it would not work?
LUND: Well, that is kind of the mode that we got ourselves into that evening. It seems like we have always been in the opposite mode. I should have detected that, but I did not, but the roles kind of switched.
Lund has put his finger on the shift in the frame that went on that night. And he was right in the middle of it.
Roger Boisjoly, the staff engineer from Thiokol, who was across all the detail, observed the shift in that caucus up close. He told the Rogers Commission (my source here is a Senate Government report that quotes the Rogers Commission) about how he and another engineer “spoke out and tried to explain once again the effects of low temperature”. Boisjoly testified:
“Arnie [the other Thiokol engineer vehemently against the launch] actually got up from his position which was down the table, and walked up the table and put a quarter pad down in front of the table, in front of management folks, and tried to sketch out once again what his concern was with the joint, and when he realised he wasn’t getting through, he just stopped….
… I tried one more time with the photos… I also stopped when it was apparent I couldn’t get anybody to listen.”
11:00pm to 11:15pm – the teleconference resumes and quickly finishes
Thiokol’s management got back on the teleconference with NASA’s Marshall Space Flight Centre and stated they were happy for the shuttle to fly. They faxed a signed rationale over. The rest of the conference lasted only 15 minutes.
A clear shift in the framing of the conversation
It seems pretty clear from the above description that the framing of the conversation shifted over the course of that teleconference. From “prove to me that we should launch” to “prove to me that we shouldn’t launch”. In a 2002 co-authored paper (in Science and Engineering Ethics, 2002, 8, pp 59-81), Roger Boisjoly, the staff engineer at the teleconference and the Thiokol caucus, describes this as a shift in the “burden of proof”:
“… it is hard to understand how those at NASA and Marshall could have thought the Challenger flight ready unless they presumed that unless the engineers could show that the flight would fail, then it would succeed.”
Why the shift?
Of course, all this raises the question of why the frame for the conversation shifted. There are all kinds of possible answers here. Traditionally people seem to think that NASA exerted too much pressure on Thiokol. I think that’s unfair. All of these guys – NASA and Thiokol – were caught up in a system and a culture and a set of experiences.
I find a paper by William H Starbuck and Frances J Milliken, “Challenger: Fine-tuning the odds until something breaks”, more convincing. They see a gradual acclimatization to risk as the key issue here, with past successes breeding complacency in NASA and the contractors. They have a very interesting observation from Milton Silveira (NASA’s chief engineer at the time) contrasting the Apollo program with the shuttle program:
“In the early days of the space program we were so damned uncertain of what we were doing that we always got everybody’s opinion. We would ask for continual reviews, continual scrutiny by anybody we had respect for, to look at this thing and make sure we were doing it right. As we started to fly the shuttle again and again, I think the system developed false confidence in itself and didn’t do the same thing”.
It seems to me that the rise of a “false confidence” was the key driver for the shift in that framing of the teleconference we saw that night.
So what can we learn as we design conversations to make decisions?
So what can we learn from all this? As we design conversations for our organisations to make important decisions, what are some principles we can derive from this whole Challenger experience? Let me offer a few:
- Be clear on the “framing” of the conversation to start with. In other words, be clear on the question you are trying to answer, or at least have a clear purpose for the conversation. A clear statement of principles can also be useful. If the participants in the teleconference had explicitly said “we are trying to prove that we shouldn’t launch” they would have immediately seen their problem. But it remained unstated (recall Lund’s words “I should have detected that, but I did not”).
- Present material as clearly as possible. Tufte, the information design guru, criticises the engineers at the teleconference for not presenting the relationship between temperature and O-Ring performance clearly enough. Having read Boisjoly’s response to Tufte, I think Tufte is being unfair. But the point remains. Make sure the information you are presenting to participants is as clear as possible. And put it in a format where participants don’t have to hold it in short term memory – this is where posters are good.
- Get rid of “rank”. As much as possible, get rid of rank in the conversation. Phrases like “management decision” and “management hat” are really problematic. A conversation should allow all participants to contribute equally. If they have enough expertise to be in the conversation then they should be heard equally.
- Choose your time of day. Don’t hold a conversation on whether to launch a space shuttle at 10:30 at night. No one is at their best at this stage. People are tired – mentally and emotionally stretched. They are unable to hold all the elements in their heads, so they can’t think critically, and they are likely to take shortcuts. Lund’s words “I should have detected that, but I did not” ring true again. As does Boisjoly’s observation that he couldn’t “get anybody to listen”. That’s what people do at 10:30 at night. So, think about when people will be at their best and design the conversation accordingly.
- Don’t use video or teleconferencing if you can avoid it. I know TED have just done their first hologram talk… but give me a break. I’ve done enough conversations on video and telephone link to know that even in 2016 it’s still wretchedly suboptimal.
Let me know what you think
Let me know what you think. Are there other lessons to take away from this teleconference?