The teleconference before the Challenger disaster – how the thinking shifted (Part Two)

Nick Ingram | Thinking | 5 Comments

Shuttle Assembly Building at Kennedy Space Centre at Cape Canaveral, Florida

This is part two of a post begun last week on the Challenger disaster. This post looks at the teleconference the night before the Challenger shuttle launch in 1986 and why the decision was made to launch even though the engineers were nervous. But to understand this post you need to understand the O-Rings on solid fuel rocket boosters and why they don’t “seat” well in the cold. I covered all this in last week’s post. Have a read of that post first.

The key problem – the “framing” of the conversation changed

My contention is that the “framing” of the conversation changed subtly in this teleconference the night before launch. That is, the implicit question the participants were trying to answer got reframed as the teleconference proceeded. As a result of that reframing, the conversation produced the wrong outcome.

What do I mean by the “frame” of the conversation? As a flight readiness conversation, the question it should have been trying to answer was “Prove to me that we should launch”. However, by the end of the teleconference it is very apparent that the question people are trying to answer is “Prove to me that we shouldn’t launch”.

How the teleconference unfolded

Background

The teleconference was called for 8:45pm on January 27th 1986 between the Marshall Space Flight Centre in Alabama (part of NASA) and the contractor in Utah, Morton Thiokol, which had built the solid fuel rocket boosters. The shuttle was due to be launched the next morning, after having had its launch delayed by 24 hours because of strong winds. The teleconference had been called because Thiokol was concerned about the weather forecast showing temperatures below freezing for the shuttle launch the next day.

Remember, the coldest launch for the shuttle to date had been in January the year before at 53 degrees Fahrenheit. That launch had shown the most O-Ring damage so far. The prediction was now for a launch tomorrow with the temperature in the 20s.

8:45pm to 10:30pm – the first part of the teleconference

“away from goodness in the current database”

The teleconference opened with the Thiokol staff engineer Roger Boisjoly presenting his argument about the O-Rings. His basic argument was that the low temperature was likely to prolong the time before the primary O-Ring would seat. He claimed that this is what accounted for the damage seen in the previous cold launch the year before (when it had been 53 degrees). (If the primary O-Ring didn’t seat in time the secondary O-Ring would be left to do the job, and there was other evidence that the secondary O-Ring was not working effectively beyond about 600ms after launch.)

(By the way, the main reference I have for this teleconference is Claus Jensen’s book “Contest for the heavens. The road to the Challenger disaster”, Harvill, 1996. All references are to that book unless otherwise indicated.)

Boisjoly’s argument is simple. The colder the O-Ring, the harder and less responsive it is. And the more viscous the grease around it becomes. So it takes longer to seat, allowing gases to burn past it and erode it. He told the Rogers Commission about this temperature effect: “it was easier to squeeze a sponge into a crack than it was to force a brick” (p304). Feynman, of course, went on to show the same thing (recall last week’s post).

It’s at this point that we see the framing start to shift. Boisjoly is asked to quantify his concern. Could he give a probability of failure? And of course he can’t – of the 25 launches, this is the coldest launch by far. All he can do is say that: “it was away from goodness in the current data base” (p304).

The teleconference proceeds, and it’s at this point that one of the more interesting characters gets his part. Robert K Lund, the VP of Engineering at Thiokol, the guy whose job it is to run the engineers and to engage with management, gives his opinion. He says Challenger should not be launched in temperatures below 53 degrees (this being the temperature of the next coldest launch, in January the year before). His logic seems to be: we suspect cold has something to do with it, we got away with it at 53 degrees, let’s not risk anything colder.

And again we see the framing shift further. George Hardy, a senior guy at Marshall (NASA), gets agitated and says he is “appalled” (p305) by this recommendation from Lund. Another Marshall guy, Lawrence Mulloy, wants to know if Thiokol is saying that NASA should delay all launches until “next April” (p306).

The discussions continue. It seems that the engineers from Thiokol will stick to their guns. But, at 10:30pm a senior manager at Thiokol (not an engineer) calls for a halt so that they can caucus briefly on their own.

10:30pm to 11:00pm – the Thiokol caucus

“Take off your engineering hat and put on your management hat”

Jerry Mason, a Senior VP at Thiokol and the most senior manager present, starts the Thiokol caucus by observing that the decision from here on will be “a management decision” (p 306) and asking “Am I the only one who wants to fly?” (p 307).

It’s at this point we see a further shift, and things get really tricky for Thiokol’s senior engineer Robert Lund. Mason (his ultimate boss) turns to Lund and asks him to “take off his engineering hat and put on his management hat” (p307). And Lund changes his mind. The questioning of Lund at the Rogers Commission is worth quoting extensively because it is so illuminating and so painful (p307-308) (emphasis mine):

ROGERS: How do you explain the fact that you changed your mind when you changed your hat?

LUND: … We had to prove to them that we weren’t ready. And so we got ourselves in the thought process that we were trying to find some way to prove to them it [the launch] wouldn’t work…

ROGERS: In other words, you honestly believed that you had a duty to prove that it would not work?

LUND: Well, that is kind of the mode that we got ourselves into that evening. It seems like we have always been in the opposite mode. I should have detected that, but I did not, but the roles kind of switched.

Lund has put his finger on the shift in the frame that went on that night. And he was right in the middle of it.

Roger Boisjoly, the staff engineer from Thiokol, who was across all the detail, observed the shift in that caucus up close. He told the Rogers Commission (my source here is a Senate Government report that quotes the Rogers Commission) about how he and another engineer “spoke out and tried to explain once again the effects of low temperature”. Boisjoly testified:

“Arnie [the other Thiokol engineer vehemently against the launch] actually got up from his position which was down the table, and walked up the table and put a quarter pad down in front of the table, in front of management folks, and tried to sketch out once again what his concern was with the joint, and when he realised he wasn’t getting through, he just stopped….

… I tried one more time with the photos… I also stopped when it was apparent I couldn’t get anybody to listen.”

11:00pm to 11:15pm – the teleconference resumes and quickly finishes

Thiokol’s management got back on the teleconference with NASA’s Marshall Space Flight Centre and stated they were happy for the shuttle to fly. They faxed a signed rationale over. The rest of the conference lasted only 15 minutes.

A clear shift in the framing of the conversation

It seems pretty clear from the above description that the framing of the conversation shifted over the course of that teleconference. From “prove to me that we should launch” to “prove to me that we shouldn’t launch”. In a 2002 co-authored paper (in Science and Engineering Ethics, 2002, 8, pp 59-81), Roger Boisjoly, the staff engineer at the teleconference and the Thiokol caucus, describes this as a shift in the “burden of proof”:

“… it is hard to understand how those at NASA and Marshall could have thought the Challenger flight ready unless they presumed that unless the engineers could show that the flight would fail, then it would succeed.”

Why the shift?

Of course, all this raises the question: why did the frame for the conversation shift? There are all kinds of possible answers here. Traditionally people seem to think that NASA exerted too much pressure on Thiokol. I think that’s unfair. All of these guys – NASA and Thiokol – were caught up in a system and a culture and a set of experiences.

I find a paper by William H Starbuck and Frances J Milliken, Challenger: Fine-tuning the odds until something breaks, more convincing. They see a gradual acclimatization to risk as the key issue here. Past successes breeding a complacency in NASA and the contractors. They have a very interesting observation from Milton Silveira (NASA’s chief engineer at the time) contrasting the Apollo program with the shuttle program:

“In the early days of the space program we were so damned uncertain of what we were doing that we always got everybody’s opinion. We would ask for continual reviews, continual scrutiny by anybody we had respect for, to look at this thing and make sure we were doing it right. As we started to fly the shuttle again and again, I think the system developed false confidence in itself and didn’t do the same thing”.

It seems to me that the rise of a “false confidence” was the key driver for the shift in that framing of the teleconference we saw that night.

So what can we learn as we design conversations to make decisions?

So what can we learn from all this? As we design conversations for our organisations to make important decisions, what are some principles we can derive from this whole Challenger experience? Let me offer a few:

  1. Be clear on the “framing” of the conversation to start with. In other words, be clear on the question you are trying to answer, or at least have a clear purpose for the conversation. A clear statement of principles can also be useful. If the participants in the teleconference had explicitly said “we are trying to prove that we shouldn’t launch” they would have immediately seen their problem. But it remained unstated (recall Lund’s words “I should have detected that, but I did not”).
  2. Present material as clearly as possible. Tufte, the information design guru, criticises the engineers at the teleconference for not presenting the relationship between temperature and O-Ring performance clearly enough. Having read Boisjoly’s response to Tufte, I think Tufte is being unfair. But the point remains. Make sure the information you are presenting to participants is as clear as possible. And put it in a format where participants don’t have to hold it in short term memory – this is where posters are good.
  3. Get rid of “rank”. As much as possible, get rid of rank in the conversation. Phrases like “management decision” and “management hat” are really problematic. A conversation should allow all participants to contribute equally. If they have enough expertise to be in the conversation then they should be heard equally.
  4. Choose your time of day. Don’t hold a conversation on whether to launch a space shuttle at 10:30 at night. No one is at their best at this stage. People are tired – mentally and emotionally stretched. They are unable to hold all the elements in their heads, so they can’t think critically, and they are likely to take shortcuts. Lund’s words “I should have detected that, but I did not” ring true again. As does Boisjoly’s observation that people had “stopped listening”. That’s what people do at 10:30 at night. So, think about when people will be at their best and design the conversation accordingly.
  5. Don’t use video or teleconferencing if you can avoid it. I know TED have just done their first hologram talk… but give me a break. I’ve done enough conversations on video and telephone link to know that even in 2016 it’s still wretchedly suboptimal.

Let me know what you think

Let me know what you think. Are there other lessons to take away from this teleconference?


5 Comments on “The teleconference before the Challenger disaster – how the thinking shifted (Part Two)”

  1. Nowhere in all the discussions I have seen has the possibility of killing seven people been considered. Perhaps the discussions should have been steered toward possible charges of manslaughter should the O-rings fail.

  2. Given hindsight, it’s easy to be critical of the “management hat” people who got the decision changed. But I understand their perspective: nothing is without risk, especially not spaceflight. And if the shuttle never launched or launches kept getting delayed, the program could have been shut down. They were dealing with political pressure from Congress members who thought NASA was a waste of money.

    In that context, constantly deciding not to launch could easily have ended the shuttle program with nothing to replace it. Obviously in this case they should have delayed the launch but the shuttle program was already suffering from significantly fewer launches than promised. This was leading them to take risks in order to try and keep the launches going. The deeper problem was that the shuttle was a mistake from the beginning – it was a disastrous decision to replace the Apollo technology which had landed humans on the moon with something needlessly complicated that only got a fraction of the distance of its predecessor.

  3. Engineering of safety critical systems always involves risk, so saying that nothing is without risk isn’t really clarifying the analysis. The evidence we have on group decision making, groupthink and risky shift, as well as conservative shift, suggests that discussions and conclusions can be impacted quite subtly by the management of the discussion. It takes a very clear head to challenge people who are changing the question posed (framing) and I would guess most engineers have found themselves in the situation where they feel others aren’t listening to or considering their views. Projecting back the responsibility for the potential loss of life in terms of tolerability of risk is a useful consideration, but so equally is the impact on continuation of the programme. I liked the quote about earlier programmes where they appealed to the wisdom of crowds, canvassing far and wide. That clearly reflects the difference between dismissive sidelining and active consideration. This makes the emotional tone of such meetings an important barometer of effective decision making. Do we train engineers to effectively persuade others of their position? And should we find fault in their presentation, when it takes time to develop effective presentation materials? Stop, think, reflect: when we press on regardless it is usually a warning.

  4. Diane Vaughan wrote an exhaustive book, The Challenger Launch Decision, in which she discusses what Nick calls “a gradual acclimatization to risk” under the phrase “normalization of deviance.” The thinking about different risks had been done long before January 27, 1986, and the O-rings had been labeled Criticality Level 1 (loss of life risk). But NASA and Thiokol had repeatedly waived this risk in order to continue launching shuttles (even before the Challenger explosion). This repeated waiver is what Vaughan has in mind when she refers to normalization of deviance.
    The political and financial pressures were enormous. (President Reagan was scheduled to give the State of the Union address on January 28, and the White House wanted him to be able to brag about the first school teacher in space. Thiokol was seeking a new contract, which meant the jobs of senior managers could be in jeopardy. https://www.washingtonpost.com/archive/politics/1986/02/27/thiokol-was-seeking-new-contract-when-officials-approved-launch/db2ab502-9723-4f90-afd0-8c75bf2237b4/)
    After 15 years working in the judgment and decision-making field, I have concluded that every decision has four essential components (whether or not consciously considered): Relevant facts (data on O-rings, criticality level, current temperature, weather forecasts, etc.). Values, goals, and interests (White House desires, Thiokol contract, sanctity of human life, etc.). Available Options (postpone launch until weather is warmer, stop launching until O-Ring problem solved, go ahead with the launch, etc.). Possible/Probable Future Consequences (everything could go well, the shuttle could explode, there could be an inquiry, senior managers could lose their jobs, the shuttle program could be shut down, etc.).
    Nick’s post reminds us that the quality of decisions is measured by the sufficiency of deliberation and not by outcomes.
