Tuesday, July 14, 2015

When New Horizons Halted

As the New Horizons spacecraft flies by Pluto today, it is collecting data from its many sensors and sending it back to Earth. But this success almost didn't happen.

On Saturday, July 11, The Washington Post published an article about the crisis that had occurred a week earlier, on July 4, when the operations team lost contact with the spacecraft.

The story illustrates a couple of key ideas in decision making and risk management.

First, the loss of contact occurred because the spacecraft was programmed with a contingency plan: if something goes wrong, go to safe mode. In safe mode, the spacecraft switches to the backup computer, turns off the main computer and other instruments, starts a controlled spin to make navigation easier, and starts transmitting on another frequency. A contingency plan is a great way to manage risk because the response is decided in advance, before the problem occurs.
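For readers who think in code, the contingency plan described above can be sketched as a simple rule: when a fault is detected, carry out the predefined safe-mode steps. This is only an illustrative sketch, not the actual flight software. The class `Spacecraft`, its fields, and the functions `enter_safe_mode` and `check_health` are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Spacecraft:
    """Hypothetical, highly simplified spacecraft state (illustration only)."""
    active_computer: str = "main"
    main_computer_on: bool = True
    instruments_on: bool = True
    spinning: bool = False
    frequency: str = "primary"


def enter_safe_mode(craft: Spacecraft) -> Spacecraft:
    """Apply the contingency steps listed in the post, in order."""
    craft.active_computer = "backup"   # switch to the backup computer
    craft.main_computer_on = False     # turn off the main computer
    craft.instruments_on = False       # turn off the other instruments
    craft.spinning = True              # start a controlled spin
    craft.frequency = "alternate"      # transmit on another frequency
    return craft


def check_health(craft: Spacecraft, fault_detected: bool) -> Spacecraft:
    """The contingency rule: if something goes wrong, go to safe mode."""
    if fault_detected:
        enter_safe_mode(craft)
    return craft


if __name__ == "__main__":
    craft = check_health(Spacecraft(), fault_detected=True)
    print(craft)
```

The point of the sketch is that the response is fixed before the fault happens, so no decision has to be made in the moment.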

Second, fixing this situation required the New Horizons operations team to manage an "issue," not a "risk," because the problem had already occurred (it was not a potential problem).

Finally, after diagnosing the problem and re-establishing contact with the spacecraft, the team had to make a key decision: whether to stick with the backup computer or switch back to the main computer (which had become overloaded, causing the crisis). Here, they displayed some risk aversion, which is not surprising considering the one-shot chance to observe Pluto: they went back to the main computer because they "trusted [it], knew its quirks, had tested it repeatedly."

Congratulations to all of the engineers, scientists, and technicians who designed, built, and operate the New Horizons spacecraft!

