A space expert warns NASA's safety culture may be eroding again
JAMES OBERG
06 AUG 2021
Russia's "Nauka" Multipurpose Laboratory Module is pictured shortly after docking to the Zvezda service module's Earth-facing port on the International Space Station, with the Brazilian coast 263 miles below. In the foreground is the Soyuz MS-18 crew ship docked to the Rassvet module on 29 July 2021.
NASA
This is a guest post. The views expressed here are solely those of the author and do not represent positions of IEEE Spectrum or the IEEE.
In a major International Space Station milestone more than fifteen years in the making, a long-delayed Russian science laboratory named Nauka automatically docked to the station on 29 July, prompting sighs of relief in the Mission Control Centers in Houston and Moscow. But within a few hours, it became shockingly obvious the celebrations were premature, and the ISS was coming closer to disaster than at any time in its nearly 25 years in orbit.
While the proximate cause of the incident is still being unravelled, there are worrisome signs that NASA may be repeating some of the lapses that led to the loss of the Challenger and Columbia space shuttles and their crews. And because political pressures seem to be driving much of the problem, only an independent investigation with serious political heft can reverse any erosion in safety culture.
Let's step back and look at what we know happened: In a cyber-logical process still not entirely clear, while passing northwest to southeast over Indonesia, the Nauka module's autopilot apparently decided it was supposed to fly away from the station. Although actually attached, and with the latches on the station side closed, the module began trying to line itself up in preparation to fire its main engines using an attitude adjustment thruster. As the thruster fired, the entire station was slowly dragged askew as well.
Since the ISS was well beyond the coverage of Russian ground stations, and since the world-wide Soviet-era fleet of tracking ships and world-circling network of "Luch" relay comsats had long since been scrapped, and replacements were slow in coming, nobody even knew Nauka was firing its thruster, until a slight but growing shift in the ISS's orientation was finally detected by NASA.
Within minutes, the Flight Director in Houston declared a "spacecraft emergency," the first in the station's lifetime, and his team tried to figure out what could be done to keep the ISS from spinning up so fast that structural damage could result. The football-field-sized array of pressurized modules, support girders, solar arrays, radiator panels, robotic arms, and other mechanisms was designed to operate in a weightless environment. But it was also built to handle stresses both from directional thrusting (used to boost the altitude periodically) and rotational torques (usually to maintain a horizon-level orientation, or to turn to a specific different orientation to facilitate arrival or departure of visiting vehicles). The juncture latches that hold the ISS's modules together had been sized to accommodate these forces with a comfortable safety margin, but a maneuver of this scale had never been expected.
Meanwhile, the station's automated attitude control system had also noted the deviation and began firing other thrusters to counteract it. These too were on the Russian half of the station. The only US orientation-control system is a set of spinning flywheels that gently turn the structure without the need for thruster propellant, but which would have been unable to cope with the unrelenting push of Nauka's thruster. Later mass-media scenarios depicted teams of specialists manually directing on-board systems into action, but the exact actions taken in response remain unclear, and they were probably mostly if not entirely automatic. The drama continued as the station crossed the Pacific, then South America and the mid-Atlantic, finally entering Russian radio contact over central Europe an hour after the crisis had begun. By then the thrusting had stopped, probably when the guilty thruster exhausted its fuel supply. The sane half of the Russian segment then restored the desired station orientation.
Initial private attempts to visualize the station's tumble from telemetry data, posted online, looked bizarre, with enormous rapid gyrations in different directions. Mercifully, the truth of the situation is that the ISS went through a simple long-axis spin of one and a half full turns, and then a half turn back to the starting alignment. The jumps and zig-zags were computational artifacts of the representational schemes used by NASA, which relate to the concept of "gimbal lock" in gyroscopes.
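To see how a smooth spin can look like wild gyrations on paper, consider the following minimal sketch. It is an illustration only and not NASA's actual telemetry format: it assumes a yaw-pitch-roll (Z-Y-X) Euler angle readout and, purely for demonstration, a steady turn about the pitch axis in 10-degree steps up to 540 degrees. The function names are hypothetical.

```python
import math

def rot_y(theta_deg):
    """Rotation matrix for a turn of theta_deg about the y (pitch) axis."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

def euler_zyx(R):
    """Yaw, pitch, roll in degrees, assuming the Z-Y-X convention."""
    yaw = math.degrees(math.atan2(R[1][0], R[0][0]))
    sin_pitch = max(-1.0, min(1.0, -R[2][0]))  # clamp against rounding error
    pitch = math.degrees(math.asin(sin_pitch))
    roll = math.degrees(math.atan2(R[2][1], R[2][2]))
    return yaw, pitch, roll

def largest_jump(values):
    """Biggest change between consecutive samples."""
    return max(abs(b - a) for a, b in zip(values, values[1:]))

STEP_DEG = 10  # the true motion: a steady turn in 10-degree steps
true_angles = list(range(0, 541, STEP_DEG))
readouts = [euler_zyx(rot_y(a)) for a in true_angles]

yaw_jump = largest_jump([r[0] for r in readouts])
max_pitch = max(abs(r[1]) for r in readouts)

print(f"true step per sample:      {STEP_DEG} deg")
print(f"largest yaw readout jump:  {yaw_jump:.1f} deg")
print(f"largest pitch readout:     {max_pitch:.1f} deg")
```

Although the true motion never changes by more than 10 degrees between samples, the yaw readout flips by about 180 degrees each time the pitch passes through 90 degrees, and the pitch readout itself never exceeds 90 degrees even though the vehicle has turned through 540. Plotted naively, those readouts would look like sudden jumps and reversals.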
How close the station had come to disaster is an open question, and the flight director alluded to it humorously in a later tweet, saying he'd never been so happy as when he saw on external TV cameras that the solar arrays and radiators were still standing straight in place. And any excessive bending stress along docking interfaces between the Russian and American segments would have demanded quick leak checks. But even if the rotation was "simple," the undeniably dramatic event has both short-term and long-term significance for the future of the space station. And it has antecedents dating back to the very birth of the ISS in 1997.
Unfortunately, this is the point at which the human misjudgments began to surface. To calm things down, official NASA spokesmen provided very preliminary underestimates of how big and how fast the station's spin had been. These were presented without any caveat that the numbers were unverified, and the real figures turned out to be much worse. The Russian side, for its part, dismissed the attitude deviation as a routine bump in a normal process of automatic docking and proclaimed there would be no formal incident investigation, especially any that would involve their American partners. Indeed, both sides seemed to agree that the sooner the incident was forgotten, the better. As of now, the US side is deep into analysis of induced stresses on critical ISS structures, with the most important ones, such as the solar arrays, first. Another standard procedure after this kind of event is to assess potential indicators of stress-induced damage, especially in terms of air leaks, and where best to monitor cabin pressure and other parameters to detect any such leaks.
The bureaucratic instinct to minimize the described potential severity of the event needs cold-blooded assessment. Sadly, from past experience, this mindset of complacency and hoping for the best is the result of natural human mental drift that comes when there are long periods of apparent normalcy. Even if there is a slowly emerging problem, as long as everything looks okay in the day to day, the tendency is to ignore warning signals as minor perturbations. The safety of the system is assumed rather than verified, and consequently managers are led into missing clues, or making careless choices, that lead to disaster. So these recent indications of this mental attitude about the station's attitude are worrisome. The NASA team has experienced that same slow cultural rot of assuming safety several times over the past decades, with hideous consequences. Team members in the year leading up to the 1986 Challenger disaster (and I was deep within the Mission Control operations then) had noticed and begun voicing concerns over growing carelessness and even humorous reactions to occasional "stupid mistakes," without effect. Then, after imprudent management decisions, seven people died.
The same drift was noticed in the late 1990s, especially in the joint US/Russian operations on Mir and on early ISS flights. It led to the forced departure of a number of top NASA officials, who had objected to the trend that was being imposed by the White House's post-Cold War diplomatic goals, implemented by NASA Administrator Dan Goldin. Safety took a decidedly secondary priority to international diplomatic value. Legendary Mission Control leader Gene Kranz described the decisions that were made in the mid-1990s over his own objections, objections that led to his sudden departure from NASA. "Russia was subsequently assigned partnership responsibilities for critical in-line tasks with minimal concern for the political and technical difficulties as well as the cost and schedule risks," he wrote in 1999. "This was the first time in the history of US manned space flight that NASA assigned critical path, in-line tasks with little or no backup." By 2001-'02, the results were as Kranz and his colleagues had warned. "Today's problems with the space station are the product of a program driven by an overriding political objective and developed by an ad hoc committee, which bypassed NASA's proven management and engineering teams," he concluded.
By then the warped NASA management culture that soon enabled the Columbia disaster in 2003 was fully in place. Some of the wording in current management proclamations regarding the Nauka docking has an eerie ring of familiarity. "Space cooperation continues to be a hallmark of U.S.-Russian relations and I have no doubt that our joint work reinforces the ties that have bound our collaborative efforts over the many years," wrote NASA Administrator Bill Nelson to Dmitry Rogozin, head of the Russian space agency, on July 31. There was no mention of the ISS's first declared spacecraft emergency, nor any expression of dissatisfaction with the Russian contribution to it.
To reverse the apparent new cultural drift, and thus potentially forestall the same kind of dismal results as before, NASA headquarters or some even higher office is going to have to intervene. The causes of the Nauka-induced "space sumo match" of massive cross-pushing bodies need to be determined and verified. And somebody needs to expose the decision process that allowed NASA to approve the ISS docking of a powerful thruster-equipped module without the on-site real-time capability to quickly disarm that system in an emergency. Because the apparent sloppiness of NASA's safety oversight on visiting vehicles looks to be directly associated with maintaining good relations with Moscow, the driving factor seems to be White House diplomatic goals—and that's the level where a corrective impetus must originate. With a long-time U.S. Senate colleague, Nelson, recently named head of NASA, President Biden is well connected to issue such guidance for a thorough investigation by an independent commission, followed by implementation of needed reforms. The buck stops with him.
As for Nauka's role in this process of safety-culture repair, it turns out that, by bizarre coincidence, a similar pattern was played out by the very first Russian launch that inaugurated the ISS program, the 'Zarya' module [called the 'FGB'] in late 1997. Nauka turns out to be the repeatedly rebuilt and upgraded backup module for that very launch, and the parallels are remarkable. The day the FGB was launched, on 23 November 1997, the mission faced disaster when it refused to accept ground commands to raise its original atmosphere-skimming parking orbit. As it crossed over Russian ground sites, controllers in Moscow sent commands, and the spacecraft didn't answer. Meanwhile, NASA guests at a nearby facility were celebrating with Russian colleagues, as nobody had told them of the crisis. Finally, on the last available in-range pass, controllers tried a new command format that the onboard computer did recognize and acknowledge. The mission, and with it the entire ISS project, was saved, and the American side never knew. Only years later did the story appear in Russian newspapers.
Still, for all its messy difficulties and frustrating disappointments, the U.S./Russian partnership turned out to be a remarkably robust "mutual co-dependence" arrangement, when managed with "tough love." Neither side really had practical alternatives if it wanted a permanent human presence in space, and they still don't—so both teams were devoted to making it work. And it could still work—if NASA keeps faith with its traditional safety culture and with the lives of those astronauts who died in the past because NASA had failed them.
Postscript: As this story was going to press, a NASA spokesperson responded to queries about the incident saying:
As shared by NASA's Kathy Lueders and Joel Montalbano in the media telecon following the event, Roscosmos regularly updated NASA and the rest of the international partners on MLM's progress during the approach to station. We continue to have confidence in our partnership with Roscosmos to operate the International Space Station. When the unexpected thruster firings occurred, flight control teams were able to enact contingency procedures and return the station to normal operations within an hour. We would point you to Roscosmos for any specifics on Russian systems/performance/procedures.
James Oberg is a retired "rocket scientist" in Texas, after a 20+ year career in NASA Mission Control and subsequently an on-air space consultant for ABC News and then NBC News. The author of a dozen books and hundreds of magazine articles on the past, present, and potential future of space exploration, he has reported from space launch and operations centers across the United States and Russia and North Korea. His home page is www.jamesoberg.com.
An unusual day aboard the ISS.
By Mark Kaufman on August 6, 2021
The International Space Station orbiting above Earth. Credit: NASA
The International Space Station flipped over on its back on July 29.
This was a significant, though fortunately not disastrous, nearly one-hour episode for humanity's largest and oldest space outpost. The station slowly turned over one and a half times. (Or as NASA describes it, the space station experienced a "total attitude change" of around 540 degrees, with "attitude" being jargon for a spacecraft's orientation.) The new Russian module "Nauka" had docked to the sizeable 356-foot-long station, but Nauka's thrusters fired when they shouldn't have, causing the space station to start unexpectedly spinning.
"It was quite an event," said Keith Crisman, an assistant professor of space studies at the University of North Dakota who researches safety systems for human spaceflight. "It was a potentially serious issue," Crisman added, noting that an out-of-control spacecraft is one of the highest-risk events in space.
Later on July 29, after flight engineers had righted the space station, NASA held a media briefing to address the unusual event. The agency's summary: All is OK, the space station had returned to normal, and nobody aboard was in danger. In fact, a NASA public affairs officer said in an email that the station's spin was "slow enough to go unnoticed by the crew members on board" (until they received warning messages), and everything else operated normally.
While it's fortunate the astronauts and cosmonauts aboard are OK, the event still raises questions about what happened, along with future concerns about space station safety.
"When spacecraft misfire it's a serious thing," said Jonathan McDowell, an astronomer at the Harvard-Smithsonian Center for Astrophysics who tracks rocket and spacecraft launches. "I cannot imagine there aren't some very serious conversations going on at NASA."
What went wrong
As noted above, thrusters on Nauka started firing after the module docked to the space station, forcing the station to (slowly) spin at a maximum of half a degree per second. It ended up upside down, before the correction. On spacecraft, these types of misfires do sometimes happen, and more easily than engineers would like, explained Kurt Anderson, a professor of mechanical aerospace engineering at Rensselaer Polytechnic Institute.
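The two figures quoted so far, a total attitude change of around 540 degrees and a peak spin rate of half a degree per second, can be cross-checked with some quick arithmetic. The sketch below is a back-of-the-envelope calculation, not a reconstruction of the actual motion. The one-hour duration is an assumption that rounds up the "nearly one-hour" description, and the variable names are illustrative.

```python
TOTAL_ATTITUDE_CHANGE_DEG = 540.0  # NASA's "total attitude change" figure
PEAK_RATE_DEG_PER_S = 0.5          # maximum spin rate quoted in this article
EPISODE_DURATION_S = 60 * 60       # assumption: "nearly one hour" taken as 3600 s

# Fastest the full 540 degrees could have been covered, if the station
# had turned at its peak rate the whole time.
min_time_s = TOTAL_ATTITUDE_CHANGE_DEG / PEAK_RATE_DEG_PER_S
min_time_min = min_time_s / 60

# Average rate implied by spreading the same turn over the assumed hour.
avg_rate_deg_per_s = TOTAL_ATTITUDE_CHANGE_DEG / EPISODE_DURATION_S

print(f"minimum time at peak rate: {min_time_min:.0f} minutes")
print(f"average rate over an hour: {avg_rate_deg_per_s:.2f} deg/s")
```

Under these assumptions, the whole turn would have taken at least 18 minutes even at the peak rate, and the average rate over the episode works out to well under a third of that peak. That is consistent with a spin slow enough for the crew not to feel it.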
In 2016, for example, a thruster on Japan's 46-foot-long astronomy satellite Hitomi misfired. Hitomi spun uncontrollably and broke apart. And perhaps most famously, the small spacecraft Gemini VIII (piloted by legendary astronaut Neil Armstrong in 1966) violently spun out of control after a thruster problem, but Armstrong impressively stopped the wild tumble and narrowly avoided a national tragedy.
(The space station, fortunately, is a big, over 925,000-pound object with lots of material to turn, so Nauka's thrusters didn't have a chance to get the station spinning treacherously.)
Life on the space station in 2020. Credit: NASA
But what triggered Nauka's mishap? A software glitch likely played a role. Nauka experienced some minor software issues before arriving at the space station, noted Crisman. The day after the unexpected flip, the Russian space agency Roscosmos officially blamed the event on a software glitch that caused the thrusters to fire in an attempt to withdraw Nauka, which had docked just several hours before. That's as much as we currently know, and it comes from a four-paragraph Roscosmos press release.
After the thrusters began misfiring, the space station soon entered an official "Loss of Attitude Control," the NASA representative told Mashable. The blasting thrusters were countered by other space station thrusters firing in the opposite direction to regain the station's normal orientation, NASA said. The station had to flip completely over — by 180 degrees — to right itself.
Real concerns
The space station excitement comes with some notable concerns, according to experts outside of NASA.
1. The space station is old and not meant for acrobatics. People first inhabited the station over two decades ago. "The ISS is an older piece of equipment. We call it legacy equipment," said Crisman.
It's not a spry vehicle intended to flip around, though the flipping in this case was far from violent. However, the station experienced thrusters fighting with each other for control of the craft, noted McDowell, which is undoubtedly somewhat strenuous for a spacecraft with attached instruments, like huge solar arrays branching out from the station. "You've got torque on relatively old parts," explained Crisman.
2. Wasted precious fuel: Stopping the station's flip required firing propellant from thrusters, which is problematic because propellant in space is finite, and at times necessary in order to maneuver the space station. There's no other way to purposefully move.
"Propellant is blood," said Anderson. Unlike for most spacecraft, however, NASA can launch more propellant to the space station, though at a cost.
3. The misfiring thrusters couldn't immediately be turned off. To stop the station from spinning, ground control operators in Russia needed to tell the automated Nauka to stop firing. But this didn't work, necessitating the counter-thrusting. "That they couldn't get the thrusters shut off immediately bothers me," said the aerospace engineer Anderson.
4. Things could have been worse — much, much worse.
Any mishap on the space station has the benefit of happening under the watchful eye of NASA's space station team in Houston. "They have a really excellent flight control team," said McDowell, of the Center for Astrophysics.
NASA's flight controllers, like flight director Zebulon Scoville, immediately noticed the station's unexpected behavior, and soon declared a "spacecraft emergency."
Yet there shouldn't have ever been an emergency, emphasized Crisman. Yes, errors are inevitable, but the system shouldn't allow such an issue to percolate down into a potentially serious, active problem. "We should have systems in place to mitigate those errors," he said.
Broadly, these systems should follow the "Swiss Cheese Model," Crisman explained. Different layers of naturally imperfect departments (or layers of Swiss cheese) like mission control, computer programmers, engineers building spacecraft, etc. should make it extremely difficult for an error to find its way through the small holes in each department's slice of Swiss. In the case of the space station flipping, an error slipped through many, many layers of international Swiss cheese.
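The logic of the Swiss Cheese Model can be sketched with a few lines of arithmetic. The numbers below are invented for illustration and are not estimates for any real space program: each layer is assumed to miss a given error 10 percent of the time, and the layers are assumed to fail independently of one another, which real organizations often do not.

```python
import math

# Hypothetical chance that each layer fails to catch a given error.
# The layer names come from the article; the 10% figures are made up.
layer_miss_probability = {
    "engineers building spacecraft": 0.10,
    "computer programmers": 0.10,
    "mission control": 0.10,
}

def chance_error_gets_through(miss_probabilities):
    """Probability an error slips past every layer, assuming the layers
    fail independently (the holes line up only by chance)."""
    return math.prod(miss_probabilities)

three_layers = chance_error_gets_through(layer_miss_probability.values())
one_layer = chance_error_gets_through([0.10])

print(f"one layer:    {one_layer:.1%} of errors get through")
print(f"three layers: {three_layers:.1%} of errors get through")
```

With these made-up figures, stacking three imperfect layers cuts the share of errors that get through from one in ten to about one in a thousand. The independence assumption carries much of that result: if the same blind spot runs through several layers, the holes line up far more often than the multiplication suggests.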
Generally, the space station is a quiet place, tranquilly orbiting some 250 miles above Earth. It's an afterthought to many of us. But things can go wrong. It's hugely fortunate, for example, that Nauka didn't start misfiring as it was docking, potentially leading to an impact with the space station.
This recent flip wasn't terrible, but it's a poignant warning of our vulnerabilities in the harsh realm of space, even on the dependable space station.
"Space is dangerous," said Crisman. "Humans are perfectly fallible, and machines are perfectly fallible."