High Reliability Organizations and Complex Adaptive Systems Can

High Reliability Organizations and Complex Adaptive
Systems
Can resilience be engineered?
David Woods
Director
Complexity in Natural, Social & Engineered Systems
The Ohio State University
&
Cognitive Systems Engineering Laboratory (C/S/E/L)
Human Systems Integration
Dept. of Integrated Systems Eng.
The Ohio State University
&
President, Resilience Engineering Association (REA)
C/S/E/L :2011
Wednesday, November 9, 2011
Releasing the Adaptive Power of Human Systems
Can Resilience Be Engineered?
Complex Adaptive Systems /
Resilience Engineering
•
new sciences (measures, models, findings) on
how adaptive system work and breakdown
•
new techniques to
~ enhance resilience in face of potential
surprise
~ monitor for increasingly brittle systems
~ assess how change expands or constricts
adaptive capacities
C/S/E/L :2011
Wednesday, November 9, 2011
Progress in Adaptive System Sciences
Characteristics of High Reliability Organizations
• Preoccupation
with failure
• Reluctance
to simplify
• Sensitivity
to operations
• Commitment
• Deference
C/S/E/L :2011
Wednesday, November 9, 2011
to resilience
to expertise
HRO and Resilience
Images of Resilience, of Brittleness
through an aggressive and innovative programme
of cost cutting on its P36 production facility.
Wednesday, November 9, 2011
Walls of Complexity
Wednesday, November 9, 2011
NASA failure history captures creeping complexity
1999: 3 space
exploration failures
2003: Run up to
Columbia accident
increasingly brittle systems
under
faster, better, cheaper (FBC)
pressure
C/S/E/L :2011
Wednesday, November 9, 2011
Mis-Managed Fundamental Tradeoffs
Complex Adaptive Systems under Faster, Better Cheaper Pressure
NASA failure history: cumulative complexity circa 2000
C/S/E/L :2011
Wednesday, November 9, 2011
Faster, Better, Cheaper Pressure
Progression of concepts about ‘Resilience’
• rebound
from traumatic event
• expand
base adaptive capacity to
handle more disruptions
• brittleness
Complex
Adaptive
Systems
C/S/E/L :2011
Wednesday, November 9, 2011
versus graceful degradation:
bring ‘extra’ adaptive capacity to bear in
the face of potential for surprise
• manage/regulate
adaptive capacities:
governance and architectures that tend
to find hard limits in tradeoff spaces
Histories of ‘Resilience’
First Principles:
• distinguish
first and second order adaptive capacity
~ potential for action in the future when conditions change or
new events challenge old models, ...
• optimality
- brittleness and other tradeoffs are
unavoidable (Doyle)
~ Even if the world were perfect, it wouldn’t be. Yogi Berra
~ Anomalies are what happens when something else was planned;
whatever the plan, something else always happens.
~ Adaptive behavior consumes success.
• cross-level
interactions
~ polycentric (Ostrom)
•
multiple perspectives
~ reflective
~ calibration
C/S/E/L :2011
Wednesday, November 9, 2011
Fundamentals of Complex Adaptive Systems
Patterns of Adaptive Breakdown
Decompensation: exhausting capacity to adapt as
disturbances/challenges cascade.
breakdown occurs when challenges grow and cascade faster than
responses can be decided on and deployed to effect.
Working at cross-purposes: behavior that is locally
adaptive, but globally maladaptive
inability to coordinate different groups at different echelons as goals
conflict.
Getting stuck in outdated behaviors: the world changes
but the system remains stuck in what were previously
adaptive strategies.
C/S/E/L :2011
Wednesday, November 9, 2011
Patterns of Adaptive Breakdown
Crisis Management
eg
Urban Firefighting
~ distributed roles
~ multiple echelons
~ disrupting factors
~ multiple goals
~ interdependencies
~ all responsible in part
C/S/E/L :2011
Wednesday, November 9, 2011
Maladaptive Patterns and Critical Incidents in Urban Firefighting
(Branlat et al., 2009)
Decompensation
• If
request resources when need is definitive, it is already too late
• Regulate additional adaptive capacity (tactical reserves)
~ maintain margins of maneuver (ability to handle next surprise)
~ “avoid all hands situations” (incident command)
• Bumpy transfers of control
Working at cross-purposes (both horizontal and vertical)
• Actions
of one group increase threats to other groups (opposing
fire hoses; rendering escape routes or protected areas unaccessible)
• Failure to resynchronize
• Goal priorities/conflicts in response to distressed firefighter
Getting stuck in outdated behaviors
• Failures
to modify plan in progress as situation changes
C/S/E/L :2011
Wednesday, November 9, 2011
Patterns in Urban Firefighting
Patterns of Adaptive Breakdown
Complexities in time --> Decompensation: exhausting
capacity to adapt as disturbances/challenges cascade.
Complexities over scales --> Working at cross-purposes:
behavior that is locally adaptive, but globally maladaptive
Complexities in learning --> Getting stuck in outdated
behaviors
C/S/E/L :2011
Wednesday, November 9, 2011
Patterns of Adaptive Breakdown
Decompensation
~ exhausting capacity to adapt as
challenges cascade
• Starling curve cardiology
• cardiovascular anesthesiology
• asymmetric lift, aviation automation,
bumpy transfer of control
• ‘surge’ capacity in ER
• ICU bed crunches
• Tempo of operations
Working at cross-purposes
~ inability to coordinate different
groups at different echelons as
goals interact and could conflict.
• Tragedy of the commons
• Fragmentation (stuck in silos)
• Missing side effects of change
• Failure to resynchronize
• Double Binds
Getting stuck
~ the world changes but the system remains stuck in what
were previously adaptive strategies.
• Oversimplifications
• Fixation
• Failing to revise plan in progress
• Discount discrepant evidence (e.g., run up to Columbia)
• Distancing through differencing
C/S/E/L :2011
Wednesday, November 9, 2011
Sub-patterns of Adaptive Breakdown
easy to see a person or a piece of technology
easy to see components, rules; easy to see things
harder to see interactions, coordination, synchronization
harder yet to see adaptation, complexity, brittleness,
resilience
easy to mistakenly––
see erratic human behavior,
regulate components
juxtapose people versus machines,
when the dynamics of complex adaptive systems are
the underlying drivers
C/S/E/L :2011
Wednesday, November 9, 2011
Complex Adaptive Systems
A. Adaptive capacity exists before disrupting events
call upon that capacity
(it is a potential for future adaptive action)
B. One assesses (observes/models/measures)
adaptive capacity through its exercise in the
anticipation and reaction to past disruptions.
(A) means that the resources that support the
potential, prior to visible disrupting events, may not
be seen at all since they are not used; or if seen, they
will be seen as excess capacity since it is not in use.
Under pressures, systems tend to lose Margins of
Maneuver
Monitor/Regulate Margin of Maneuver (MoM)
C/S/E/L :2011
Wednesday, November 9, 2011
Potential for Future Adaptive Action
Monitor/Regulate Margin of Maneuver:
Cushion of potential actions and additional resources that allows the
system to continue functioning despite unexpected demands.
How much active control margin or capability is left to handle the next
event or disturbance?
Each center of adaptive behavior works to create, maintain, and manage
their margin of maneuver.
Locally adaptive as one center manages its margin relative to
interdependencies with other centers’ behaviors to manage their margin
Failure to maintain margin leaves the system too brittle and increases the
risk of falling into the maladaptive traps (eg, locally adaptive, globally
maladaptive)
Resilient systems are able to anticipate how margins of maneuver are
expanding or contracting relative to the potential for surprise.
C/S/E/L :2011
Wednesday, November 9, 2011
Margin of Maneuver
despite tradeoffs, effective Architectures for Resilient
Control have the ability to generate, mobilize, and deploy
extra adaptive capacity to
manage complexities in time -->
ability to maintain and regulate their MoM given
challenges ahead
manage complexities over scales -->
units adjust based on impact on others MoM; recognize
adaptive shortfalls ahead
manage complexities in learning -->
learn about how extra adaptive capacity is brought to bear
to handle challenge events at the boundaries
C/S/E/L :2011
Wednesday, November 9, 2011
Architectures
Maladapted Architectures
today: people, when responsible agents in systems,
serve as a general purpose adaptive capacity to compensate for
adaptive shortfalls
• intense
struggle to maintain and regulate adequate MoM
high risk of decompensation
• units
regularly but inadvertently squeeze other units’ MoM
high risk for locally adaptive, globally maladaptive interactions
• miss
opportunities for learning How Extra Adaptive
Capacity is or needs to be Brought to Bear
~ overconfident and miscalibrated
~ high risk of getting stuck in stale approaches
• see
unintended consequences
C/S/E/L :2011
Wednesday, November 9, 2011
Maladapted Architectures
Stress response patterns for Maladapted Architectures
1. under charged organization:
ongoing struggle to be able to generate, mobilize, and
deploy extra adaptive capacity wears down units
direct experience of the risk of 3 patterns of adaptive
breakdown
2. over charged organization:
due to frequent heightened readiness to respond to
possible challenge and surprise
analogy to diseases of the immune system
C/S/E/L :2011
Wednesday, November 9, 2011
Responses to Risk of Adaptive Breakdowns
Stress response patterns for Maladapted Architectures
re-define ‘rebound’:
• people as reflective adaptive systems
• people feel the risk of adaptive breakdowns as a kind of stress
how do people react over long term to these risks when the
organization is (a) under- or (b) over- charged as a CAS system?
1. their experience of organization’s ability to readjust
dynamically: ability of organization to adapt how it adapts (move
in tradeoff spaces)
2. their experience of participation in that process (see HRO
principles and principles for polycentric governance eg,
reciprocity and initiative)
C/S/E/L :2011
Wednesday, November 9, 2011
Responses to Risk of Adaptive Breakdowns
Major challenges for research agenda on Resilience:
•
studying CASs requires a CAS;
need partner -- an organization open to question and learn
•
security (and other issues) undermines studying a human CAS
(see HRO principles)
•
requires new methods and multiple methods: studying
reverberations of change
•
developing new multi-agent simulation tools is essential
ingredient
C/S/E/L :2011
Wednesday, November 9, 2011
Research challenges to study/model CAS