International Journal of Performability Engineering, Vol. 9, No. 6, November 2013, pp. 633-640.
© RAMS Consultants
Printed in India

Shuttle Risk Progression – Focus on Historical Risk Increases

T. L. HAMLIN*
Safety and Mission Assurance, National Aeronautics and Space Administration, Houston, Texas, 77058

(Received on March 22, 2013, revised on March 27, and June 11, 2013)

Abstract: It is important for future human spaceflight programs to understand the early mission risk and the impact of design, process, and operational changes on risk. The Shuttle risk progression assessment used the knowledge gained from 30 years of operational flights and the Shuttle Probabilistic Risk Assessment (PRA) to calculate the risk of Shuttle Loss of Crew at significant milestones beginning with the first flight. The results indicated that the Shuttle risk tends to follow a step function as opposed to following a traditional reliability growth pattern. In addition, the results showed that risk can increase due to trading safety margin for increased performance, due to external events, or due to intended (disabling ejection seats) or unintended (Space Shuttle Main Engine Block II upgrade) consequences of design changes. This paper focuses on the cases where risk increased and explores the lessons that can be learned by new programs.

Keywords: NASA, Space Shuttle, Probabilistic Risk Assessment

1.0 Introduction

It is important for future human spaceflight programs to estimate the early mission risk and the impact of design, process, and operational changes on risk. The Shuttle risk progression assessment [1] used the knowledge gained from 30 years of operational flights and the Shuttle Probabilistic Risk Assessment (PRA) to retrospectively estimate the risk of Shuttle Loss of Crew at significant milestones beginning with the first flight.
The term “progression” was chosen to signify the movement through time from the first Shuttle flight to the final flight and was not meant to imply progress with respect to risk reduction. The results of the Shuttle risk progression indicated that the Shuttle risk tends to follow a step function in which the risk is constant until changes in the design, process, or operation occur. This is in contrast to the traditional, continuous reliability growth pattern obtained from a purely theoretical prediction. In addition, the results showed that risk can increase due to trading safety margin for increased performance, due to external events, or due to intended (disabling ejection seats) or unintended (Space Shuttle Main Engine Block II upgrade) consequences of design changes. This paper focuses on the cases where risk increased and explores the lessons that new programs can learn from these events.

*Corresponding author’s email: [email protected]

1.1 Background

The Space Shuttle Program (SSP) initiated the development of a Shuttle Probabilistic Risk Assessment (SPRA) in March 2001. The purpose of the SPRA was to provide a useful risk management tool for the SSP to identify strengths and possible weaknesses in the Shuttle design and operation. The SPRA was designed to help answer critical risk-related questions such as:
• What is the integrated Space Shuttle risk?
• What are the most significant risk drivers?
• How significant is uncertainty on critical risk factors?
• How robust are current risk controls to mitigate critical risks?
• How can the program best target scarce risk mitigation resources?
• What is the impact of proposed changes in Shuttle design and operations?

The general scope of the SPRA included hazard causes that may result in an in-flight Loss of Crew and Vehicle (LOCV). In-flight was defined as the time from launch (T-0) to wheel stop.
Rendezvous, docking, and extravehicular activity occurred within this time frame; however, these activities were mission-specific at the time and were considered outside the scope. The vehicle configuration was assumed to be equivalent to that of a generic Orbiter (i.e., all four Orbiters were assumed the same). The hazards assessed in the SPRA generally consist of:
• Equipment functional failures
• Flammable/explosive fluid leaks
• Structural failures, including hits from ascent debris and micrometeoroid/orbital debris
• Human errors

Each Shuttle mission was unique, defined by its payload, flight dynamics, duration, etc. However, most of these characteristics were beyond the resolution of the SPRA and could be treated on a nominal basis. Some, such as mission duration, had a direct impact on the SPRA and needed to be specified. The nominal mission duration was assumed to be 306 hours, based on the use of STS-119 as a reference mission. In addition, an ISS mission was assumed for the nominal SPRA mission, because the vast majority of the flights fell into this category.

Only in-flight end states are considered in the model; therefore, only undetectable failures that could lead to the end states of interest were considered prior to T-0. Conversely, failures occurring in flight that have the potential to cause LOCV after wheel stop were not included.

The SPRA was modeled using the Systems Analysis Programs for Hands-on Integrated Reliability Evaluations (SAPHIRE) software. The SPRA model was developed using linked event trees to represent the integration of the vehicle elements, the three mission phases (Ascent, Orbit, and Entry), and four performance abort phases. A single entry point of “Launch” was used, with fault trees linked into the event tree top events.

In 2010, a study of how Shuttle risk changed over time, known as the Shuttle Risk Progression Study, was initiated using the Shuttle PRA. The results were published in 2011 [1].
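As a rough illustration of how an event tree with fault trees linked into its top events yields a single end-state probability, the sketch below treats each of the three mission phases as one top event whose failure probability comes from a small fault tree of independent basic events. The phase names follow the text; the basic events, their probabilities, and the function names are hypothetical placeholders rather than SPRA values, and the abort phases are omitted.

```python
from math import prod

def or_gate(probs):
    """Probability that at least one independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probs)

def and_gate(probs):
    """Probability that all independent input events occur."""
    return prod(probs)

# Hypothetical fault trees, one per event-tree top event (mission phase).
# None of these numbers come from the SPRA.
phase_fault_trees = {
    "Ascent": or_gate([2e-3, 1e-3, and_gate([1e-2, 5e-2])]),
    "Orbit": or_gate([5e-4, 3e-4]),
    "Entry": or_gate([4e-3, 1e-3]),
}

def locv_probability(phase_probs):
    """Walk the event tree from 'Launch': a phase is reached only if
    every earlier phase succeeded."""
    survive = 1.0
    locv = 0.0
    for p_fail in phase_probs.values():
        locv += survive * p_fail
        survive *= 1.0 - p_fail
    return locv

p_locv = locv_probability(phase_fault_trees)
print(f"P(LOCV) = {p_locv:.5f} (about 1:{round(1 / p_locv)})")
```

A tool such as SAPHIRE solves the linked model through minimal cut sets rather than by direct multiplication, so this is only meant to show how phase-level top events accumulate into one LOCV figure.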
2.0 Methodology

In the Shuttle PRA, failure contributors which have been mitigated through redesign or process improvements are discounted and appropriately reduced in probability. The retrospective analysis approach is to remove these discounts in order to estimate the risk prior to the improvements. For risk contributions which are more complex and are quantified via Bayesian analysis, such as the contributions from ascent debris or the Reusable Solid Rocket Motor (RSRM), early flight risk estimates are specifically modeled and documented in the Shuttle Risk Progression Report [3].

To ensure that the risk differences do not reflect a particular mission objective, the analysis models the current mission with the vintage vehicle. Since the model is based upon Iteration 3.3 of the SPRA [4], the mission duration and Micro-meteoroid Orbital Debris (MMOD) risk are based upon STS-119. Earlier missions, although shorter in duration, were dominated by risks which were independent of mission length (e.g., RSRM, ascent debris). No model logic changes were made to the Iteration 3.3 model. Inspection, repair, and crew rescue improvements made after the Columbia accident were not included for the flights prior to the Columbia accident.

As previously mentioned, this analysis uses the SPRA and is therefore subject to its limitations. A description of the SPRA limitations can be found in the Iteration 3.0 integration notebook [2]. In addition to the general SPRA limitations, the following limitations are specific to the analysis and results presented in this paper. The analysis is based upon the current understanding of Shuttle risk looking back after 30 years of operating history, covering those risk contributors that were considered; it therefore does not address any still unknown risks.
The analysis can be used to inform a new program of the general trend of reliability growth for a complex, high-risk vehicle, but specific values should not be used, since a new program will have its own lessons to learn, may be starting at a different point, and may be operating under different conditions.

3.0 Shuttle Risk Progression Results

Figure 1 provides the estimate of the overall Shuttle risk progression. Risk is defined here as the probability of loss of crew. The uncertainties on the estimates correspond roughly to a range factor of 2; the relative changes in risk have smaller uncertainties because of the positive correlation due to the common baseline risk. As observed, the overall trend in the failure probability is a significant decrease from approximately 1:12 for the first flight to 1:90 at the latest flight. The failure probability also remains approximately constant between changes made to the Shuttle design and/or operation that could impact the failure probability. It is important to note that the failure probability does not monotonically decrease with time/missions, but instead increases at some points. This paper focuses on these increases, since they provide important information and lessons.

Figure 1: Shuttle Risk Progression Summary Highlighting Risk Increases (vertical axis: Probability; horizontal axis: Flight Sequence #; milestone flights marked: STS-1, STS-5, STS-41B, STS-51L, STS-26 and STS-29, STS-49, STS-77, STS-86, STS-89, STS-103, STS-110, STS-114)

The first notable risk increase, encircled at the far left of Figure 1, occurred when the Shuttle went from a test vehicle to an operational vehicle and the crew size increased from 2 to 4 on STS-5. Because of the increase in crew size, the ejection seats, which were provided to the commander and pilot of the test flights, were disabled, since no ejection seats were available to the remaining crew.
The ejection seats were completely removed from the vehicle prior to its next flight. Although the ejection seats are a relatively effective crew escape system early in ascent (up to ~80K feet) and late in entry, their impact on risk was limited because of the other high risk contributions associated with events that could occur outside that window, such as vehicle breakup on entry. At the time, the risk associated with a Shuttle mission was not fully appreciated, which is often the case for new vehicles due to the difficulty in quantifying unknown risks. The earliest Shuttle risk estimates were ~1:1000, as shown in Figure 2, and because of the unknown risks the impact of the decision to disable the ejection seats was not fully understood at the time. Following Challenger, crew escape systems were evaluated as a potential Shuttle upgrade but were abandoned because of the implementation difficulty as well as the significant cost and schedule impacts.

Figure 2: Shuttle Risk Progression Summary with Historical Risk Estimates (vertical axis: Probability; horizontal axis: Flight Sequence #, STS-1 through STS-133). Historical estimates shown: Wiggins Analysis (1:1000 to 1:10000), Weatherwax Analysis (1:35), Galileo Study (1:55), Updated Galileo Study (1:73), 1995 PRA (1:131), 1998 PRA (1:234), Shuttle PRA (1:61 to 1:90).

Since the SPRA model was not set up to include ejection seats, the analysis was completed by reviewing the top 99% of the cut sets and using engineering judgment to determine whether, and to what extent, ejection seats would be able to mitigate each scenario.
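A minimal sketch of this kind of cut-set screening is shown below, assuming each cut set is assigned a judged fraction of its scenario that falls inside the ejection-seat window and that a recoverable scenario is survived with the 90% seat success rate the analysis assumes. The scenario labels, probabilities, recoverable fractions, and the `apply_recovery` helper are all hypothetical illustrations, not values or rules from the SPRA or SAPHIRE.

```python
from dataclasses import dataclass

SEAT_SUCCESS = 0.9  # success rate assigned to a recoverable scenario in the text

@dataclass
class CutSet:
    name: str           # hypothetical scenario label
    probability: float  # hypothetical loss-of-crew cut-set probability
    recoverable: float  # judged fraction inside the ejection-seat window (0..1)

def apply_recovery(cut_sets, seat_success=SEAT_SUCCESS):
    """Return (risk without seats, risk with seats) using a rare-event sum."""
    without_seats = sum(cs.probability for cs in cut_sets)
    with_seats = sum(
        cs.probability * (1.0 - cs.recoverable * seat_success)
        for cs in cut_sets
    )
    return without_seats, with_seats

# Hypothetical cut sets; the recoverable fractions stand in for engineering judgment.
cut_sets = [
    CutSet("early-ascent failure", 2.0e-2, 1.0),
    CutSet("late-ascent failure", 3.0e-2, 0.0),
    CutSet("vehicle breakup on entry", 4.0e-2, 0.0),
    CutSet("late-entry failure", 1.0e-2, 0.5),
]

without_seats, with_seats = apply_recovery(cut_sets)
print(f"without seats: {without_seats:.4f}  with seats: {with_seats:.4f}")
```

Scenarios outside the seat window keep their full probability, which is why the seats reduce the total only modestly when such scenarios dominate.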
Preliminary results were calculated using Excel, and recovery rules were then used to post-process the cut sets in SAPHIRE in order to calculate a mean with uncertainty. Given a scenario that is assumed to be recoverable, ejection seats are given a 90% success rate (i.e., there is a 10% chance that either crewmember will not survive).

The next notable risk increase, which is encircled in Figure 1, occurs on STS-86. Changes in the External Tank (ET) foam and application process led to a significant number of Orbiter damages, which were estimated to increase the risk contribution from critical ascent debris damage. Figure 3 shows the number of Orbiter lower surface damage occurrences in order of the ET start date. From ET-88 (STS-86) to ET-100 (STS-96), the average number of damages greater than 1 inch increased from approximately 13 to approximately 45.

Figure 3: Orbiter Lower Surface Damages Arranged by ET Start Date (vertical axis: Debris Hits; horizontal axis: ET Number; black indicates LWT, red indicates SLWT). Annotations: Mission 87, STS-86, ET-88 was the first mission with new foam on the tank's acreage; Mission 88, STS-87, ET-89 was the first mission with new foam on the intertank; Mission 96, STS-103, ET-101 was the first mission with venting holes on the ET TPS; the position of LWT ET-93, used on STS-107 (Columbia accident), is marked.

Tile risk from ascent debris is modeled differently from functional and phenomenological risk because it does not lend itself to the traditional failure rate calculation methodology. The modeling of tile risk is based upon historical occurrences of Orbiter lower surface damages greater than 1 inch and uses the JSC S&MA developed Ascent Debris Analysis Model (ADAM) [5].
ADAM uses input distributions derived from historical damages (length, width, depth, quantity, location) and simulates damages in a mission. The simulated mission damage is compared against the Orbiter damage criteria to estimate the probability that the damage is critical, that is, that it would cause LOCV on re-entry if not mitigated through repair or crew rescue. The separate estimate of the risk contribution for the Reinforced Carbon Carbon (RCC) material used on the Orbiter’s wing leading edges and nose cone is based upon flight history, using engineering judgment to adjust the damage for the changing environment.

A report documenting lessons learned from the development of the External Tank (ET) [6] provides insight into what occurred from ET-88 (STS-86) to ET-100 (STS-96), both why risk increased and why it decreased again on STS-103. Preceding these flights, the Environmental Protection Agency (EPA) banned the use of CFC-11 Freon, which was used extensively in the ET foam. The new foam, along with a new blowing agent, was introduced over three tanks starting with STS-85. The first flight had no noticeable problems, but the use of the new foam was limited. STS-86 was the first flight with the new foam on the tank’s acreage, with above average damage on the Orbiter as seen in Figure 3. However, the next flight, which had the new foam on the ET intertank area, had significantly higher damages. Following this flight, a team was established to investigate this problem. As part of the investigation, a thermal/vacuum testing program found that the high vapor pressure of the new blowing agent, combined with the lower yield strength of the new foam and the propensity for this foam to fail on slip planes parallel to the ET intertank ribs, caused small chunks of foam to come off [6].
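The damage-simulation step described for ADAM earlier in this section can be illustrated with a much simplified Monte Carlo sketch. It is not ADAM's actual implementation: it samples only a damage count and a depth per damage, whereas ADAM also uses length, width, and location. The Poisson count model, the lognormal depth distribution, the 1.0 inch criticality threshold, and all function names are assumptions made for illustration; only the average damage counts of roughly 13 and 45 per mission come from the text.

```python
import random

def poisson_draw(rng, mean):
    """Draw a Poisson count by counting unit-rate arrivals in [0, mean)."""
    count = 0
    t = rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count

def mission_is_critical(rng, mean_damages, critical_depth_in):
    """Simulate one mission; True if any simulated damage exceeds the criterion."""
    for _ in range(poisson_draw(rng, mean_damages)):
        # Hypothetical depth distribution (inches), not derived from flight data.
        depth_in = rng.lognormvariate(-2.0, 0.8)
        if depth_in > critical_depth_in:
            return True
    return False

def critical_probability(mean_damages, critical_depth_in=1.0, trials=5000, seed=1):
    """Monte Carlo estimate of the per-mission probability of critical damage."""
    rng = random.Random(seed)
    critical = sum(
        mission_is_critical(rng, mean_damages, critical_depth_in)
        for _ in range(trials)
    )
    return critical / trials

p_before = critical_probability(13)  # average damage count before the foam change
p_after = critical_probability(45)   # average damage count after the foam change
print(f"before: {p_before:.3f}  after: {p_after:.3f}")
```

Under these assumptions, the estimated probability of critical damage rises with the average number of damages per mission, which is the qualitative effect the foam change had on the tile risk contribution.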
This risk contribution from the foam was eventually mitigated by punching vent holes into the foam in areas where there was considered to be a transport mechanism to the Orbiter. The effect can be seen in the decrease in Orbiter damages starting at STS-103. Although STS-87 was recognized at the time as a significant anomaly, the overall risk to the crew was not well understood, and the Shuttle continued to fly while the event was being investigated. In hindsight, the certification of the new foam did not involve testing of the material to the complete flight environment, which could have identified the issue prior to its first flight; the focus was on the critical properties identified in the previous development and test program.

The additional risk increases that occurred in the evolution of the Shuttle risk are not visible in Figure 1, since they are offset by decreasing risk contributions in other areas, including decreases in the software risk contribution. These risk increases are associated with operational and design changes to the Space Shuttle Main Engines (SSMEs). Figure 4 shows the estimated impact of these changes on risk.

Figure 4: SSME (Uncontained) Risk Contribution Evolution Highlighting Risk Increases (vertical axis: Probability; horizontal axis: Flight Sequence #, STS-1 through STS-133)

On STS-6, the SSME operational power was increased in order to increase performance, thus reducing the safety margin and increasing the risk of uncontained engine failure. The increased performance was needed to accommodate additional weight. Since STS-6 was not an analyzed mission, this increase shows up on STS-41B in Figure 4.
This increase in risk was estimated by reviewing engine tests and the failures that occurred, identifying those failures which could be attributed to operating at greater than 100% power level, and extrapolating to the failure probability in flight. Although the engine was certified to operate at the higher power level, the change decreased the safety margin and increased the risk contribution. At the time, the risk increase was not quantified, but in hindsight it resulted in a 20% increase in the probability of an uncontained SSME failure. It is unlikely that this increase in risk would have changed the decision to increase the operational power level, but quantifying it would have been beneficial for making a risk-informed decision.

When the SSME was upgraded by the introduction of the High Pressure Fuel Turbopump-Alternate Turbopump, there were three early failures during testing. These test failures were mitigated prior to the engines being flown and were therefore discounted in the analysis of the flight engines, but risk still increased due to the remaining residual risk. There was a slight increase in risk (~12%) due to the addition of a new failure mode that did not previously exist. This demonstrates that vehicle upgrades that are intended to reduce risk associated with known failure modes can in fact introduce new failure modes and increase the overall risk. Eventually, with the addition of the Advanced Health Management System (AHMS), the SSME uncontained engine risk decreased but still remained higher than before the changes.

4.0 Conclusions

Overall, the Shuttle mission risk improved by approximately an order of magnitude over the life of the program. Risk reductions are the result of redesigns or operational changes, the most significant of which followed major events (e.g., Challenger, Columbia, STS-27’s TPS damage).
The focus of this paper is on the risk increases that also occurred due to changes to the Shuttle. The results thus differ from theoretical reliability growth models, which predict steady risk reduction and reliability improvement as the system operates and evolves. The analysis showed that risk increased due to trading safety margin for increased performance or due to changes that caused external event risk to increase. An example of trading safety margin is the increase in SSME risk with the increase in operating power level. An example of an external event which increased the risk is the change in the foam insulation used for the ET.

An important message to new programs is that external influences, operational changes, and design upgrades can cause risk increases as well as risk decreases. It may be necessary to reassess whether or not previous testing and analysis is appropriate for the new configuration. In the case of the ET foam, the previous testing and analysis was inadequate to detect the impact of the new foam and blowing agent on debris liberation in flight. Furthermore, it may be necessary to reassess the benefit of a design upgrade if preliminary testing indicates there are new failure modes.

References

[1]. Hamlin, T., E. Thigpen, J. Kahn, Y. Lo. Shuttle Risk Progression: Use of the Shuttle Probabilistic Risk Assessment (PRA) to Show Reliability Growth. American Institute of Aeronautics and Astronautics Space 2011 Conference held at Long Beach, California on September 27-29, 2011.
[2]. Thigpen, Eric. Model Integration Report, Vol. II, Rev. 3.0, NASA, Johnson Space Center, Safety and Mission Assurance Directorate, Shuttle and Exploration Division, Analysis Branch, Houston, Texas, November 2008.
[3]. Hamlin, T., J. Kahn, and Y. Lo. Shuttle Risk Progression by Flight, NASA SSMA-11-001, Rev. 1, March 2013.
[4]. Thigpen, Eric.
Shuttle PRA Iteration 3.3 Changes Notebook, NASA, Johnson Space Center, Safety and Mission Assurance Directorate, Shuttle and Exploration Division, Analysis Branch, Houston, Texas, November 2010.
[5]. Vera, J. Ascent Debris Analysis Model (ADAM) User’s Reference Guide, Version 1, NASA, Johnson Space Center, Safety and Mission Assurance Directorate, Shuttle and Exploration Division, Analysis Branch, Houston, Texas, June 16, 2010.
[6]. Pessin, M. Lessons Learned From Space Shuttle External Tank Development – A Technical History of the External Tank, October 30, 2002.

Teri Hamlin has a B.S. in Nuclear/Mechanical Engineering from Worcester Polytechnic Institute. Teri worked at Northeast Utilities for eight years performing PRA activities for their three Millstone Nuclear Power Plants. In 2002, Teri entered the aerospace industry as a PRA analyst for SAIC. At SAIC, she served as the lead for the Shuttle Human Reliability Analysis (HRA), which represents the most comprehensive Shuttle HRA to date. In 2006, Teri joined the JSC S&MA Analysis Branch as the Shuttle PRA Lead. She remained Shuttle PRA Lead until the Shuttle retirement following STS-135 in July 2011. She is currently the Commercial Crew Probabilistic Safety Analysis (PSA) lead, assisting in the development of commercial crew requirements and PSA methodology development. She is also responsible for providing insight into the commercial providers’ Loss of Crew and Loss of Mission assessments.