
Chapter X
Measures of Resilient Performance
David Mendonça
Introduction
In order to theorize, manage – even engineer – resilience, it is necessary
that the factors that contribute to resilience be identified, and that
measures of these factors be validated and exercised. Yet to date there
have been few systematic attempts to create such measures.
Complicating the matter is the fact that the resilience of safety-critical
systems may only be manifested during actual operations. As a result,
opportunities for controlled study (even systematic observations) on
resilient organizations are severely limited. There is therefore a clear
need to identify the factors that contribute to resilience, develop
measures for these factors, and validate instruments for estimating the
values of these factors.
This chapter takes as a starting point a set of factors developed in prior
research on organizational resilience. It then discusses an approach to
refining and measuring these factors. The framework is then applied to
the development and assessment of a candidate set of measures for the
factors, using data drawn from observation of infrastructure restoration
in New York City, New York, following the 11 September 2001 World
Trade Center attack.
Defining and Measuring Resilience
Among the definitions of resilience are an ability to resist disorder
(Fiksel, 2003), as well as an ability to retain control, to continue and to
rebuild (Hollnagel & Woods, 2006). Indeed, despite its relevance to the
maintenance and restoration of system safety and operability, resilience
may be a difficult concept to measure. For example, during system
operation it may be possible only to measure its potential for resilience,
rather than its resilience per se (Woods, 2006). The following factors
are thought to contribute to resilience (Woods, 2006):
• buffering capacity: the size or kind of disruption that can be absorbed or adapted to without fundamental breakdown in system performance or structure
• flexibility/stiffness: the system’s ability to restructure itself in response to external changes or pressures
• margin: performance relative to some boundary
• tolerance: behavior in proximity to some boundary
• cross-scale interactions: how context leads to (local) problem solving, and how local adaptations can influence strategic goals and interactions
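To make the boundary-related factors more concrete, a minimal sketch follows (in Python, with hypothetical names, units and values): it treats margin as the slack between observed performance and a boundary, and flags when operations enter the band in which tolerance, behavior near the boundary, becomes observable.

    # Minimal sketch: margin as slack relative to a boundary (hypothetical units).
    from dataclasses import dataclass

    @dataclass
    class BoundaryModel:
        boundary: float     # performance limit, e.g. deliverable load in MW
        performance: float  # current observed performance, e.g. actual load in MW

        def margin(self) -> float:
            # Margin: performance relative to the boundary (remaining slack).
            return self.boundary - self.performance

        def near_boundary(self, band: float) -> bool:
            # Tolerance concerns behavior inside this band; here we merely
            # flag when observations fall close enough for it to be studied.
            return self.margin() <= band

    feeder = BoundaryModel(boundary=100.0, performance=87.5)
    print(feeder.margin())             # 12.5 units of slack
    print(feeder.near_boundary(15.0))  # True: within the band where tolerance matters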
Resilience engineering is “concerned with monitoring and managing
performance at the boundaries of competence under changing
demands” (Hollnagel & Woods, 2006). In seeking to engineer
resilience, it is therefore appropriate to consider how these factors may
be measured.
Resilient performance (or the lack thereof) may arise out of need or
opportunity, though the latter case is very rarely studied. In the former
case, there are numerous studies of how organizations have dealt with
situations that push them to the boundaries of competence. Disaster or
extreme event situations combine many elements that—by definition—
challenge capabilities for planning and response.
Opportunities for examining resilient performance in response to
extreme events are limited. First, there may be high costs associated
with large-scale and nearly continuous observation of pre-event
conditions. Second, the consequences of extreme events can include
destruction of established data collection instruments, as occurred with
the emergency operations center and, later, at the New York Fire
Department command post as a consequence of the World Trade
Center attack. Third, new processes, technologies and personnel
brought in to aid the response may not be measurable with any
instruments that remain available, as commonly occurs when the
victims of an extreme event act as first responders, or when ad hoc
communication networks are formed. A very real challenge in
engineering resilience is therefore fundamentally methodological: how
can organizational theorists and designers develop and implement
measurement instruments for “experiments” which are essentially
undesignable?
Broadly speaking, measurement may be defined as the “process of
linking abstract concepts to empirical indicants” (Carmines & Zeller,
1979). It is worth emphasizing that this definition does not
presuppose that measures are quantitative, merely that the linkage
between abstract concepts and their instantiation in the real world be
established empirically. Intimately bound up in any discussion of
measurement in science and engineering are the notions of reliability
and validity. Reliability refers to “the tendency toward consistency
found in repeated measures of the same phenomenon” (Carmines &
Zeller, 1979). In other words, a measurement instrument is reliable to
the extent that it provides the same value when applied to the same
phenomenon. On the other hand, an indicator of some abstract
concept is valid to the extent that it measures what it purports to
measure (Carmines & Zeller, 1979). In other words, a valid
measurement is one that is capable of accessing a phenomenon and
placing its value along some scale. Two types of validity are commonly
investigated. Content validity “depends on the extent to which an
empirical measurement reflects a specific domain of content. For
example, a test in arithmetical operations would not be content valid if
the test problems focused only on addition, thus neglecting subtraction,
multiplication and division” (Carmines & Zeller, 1979). Construct
validity is “the extent to which an operationalization measures the
concepts it purports to measure” (Boudreau, Gefen, & Straub, 2001).
More precisely, construct validity “is concerned with the extent to
which a particular measure relates to other measures, consistent with
theoretically-derived hypotheses concerning the concepts (or
constructs) that are being measured” (Carmines & Zeller, 1979).
Construct validation involves determining the theoretical relation
between the concepts themselves, examining the empirical relationship
between the measures and the concepts, then interpreting the empirical
evidence to determine the extent of construct validity.
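To make these checks concrete, the following minimal sketch (synthetic data on a hypothetical rating scale) computes a test-retest reliability estimate as the correlation between two applications of the same instrument, and a simple convergent-validity estimate as the correlation with a second measure that theory says should covary with the first.

    # Minimal sketch: reliability and convergent validity via Pearson correlation.
    from math import sqrt

    def pearson(x, y):
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sx = sqrt(sum((a - mx) ** 2 for a in x))
        sy = sqrt(sum((b - my) ** 2 for b in y))
        return cov / (sx * sy)

    # Two applications of the same instrument to the same five incidents:
    trial_1 = [3.0, 4.5, 2.0, 5.0, 3.5]
    trial_2 = [3.2, 4.4, 2.1, 4.8, 3.6]
    print("test-retest reliability:", round(pearson(trial_1, trial_2), 3))

    # A different measure that theory predicts should covary with the first:
    related = [2.8, 4.9, 1.8, 5.2, 3.3]
    print("convergent validity:", round(pearson(trial_1, related), 3))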
With a few prominent exceptions, the path to instrument development
is seldom discussed, thus providing little insight for researchers and
practitioners about the validity and reliability of the measurements
produced by these instruments. A common exception is the survey
instrument, which is often used to access attitudes and other psychological
states that might be difficult to measure directly. Yet for quite some
time, a number of researchers have argued for increasing the portfolio
of measures used in social science. For example, numerous
unobtrusively collected measures may be of use in advancing
understanding of organizations (e.g., Weick, 1985). In the early days of
the development of theory for a new (or newly discovered) class of
phenomena, the need for discussions of instrument development is
particularly great. Without adequate attention to the assumptions
underlying instrument development, the theory may become too
narrow (or too diffuse) too quickly, leading either to an unnecessarily
narrow view or to a hopelessly broad one.
Resilience engineering is clearly a field in the midst of defining itself and
its relationship to other fields, and this includes identifying and defining
the phenomena which researchers in the field intend to investigate.
Research in resilience engineering has been predominantly informed by
field observations and not, for example, by laboratory studies. More to
the point, research in the field has been strongly interpretive, focusing
primarily on case studies. A study may be said to be interpretive “if it is
assumed that our knowledge of reality is gained only through social
constructions such as language, consciousness, shared meanings,
documents, tools and other artifacts” (Klein & Myers, 1999). The types
of generalizations that may be drawn from interpretive case studies are
the development of concepts, the generation of theory, drawing of
specific implications and the contribution of rich insights (Walsham,
1995). Principles for evaluating interpretive case studies may be used
for establishing their reliability and validity, though the methods for
doing so differ from those used in positivistic studies.
Klein and Myers (1999) provide a set of principles for conducting
interpretive field studies, as follows. The fundamental principle—that
of the hermeneutic circle—suggests that “we come to understand a
complex whole from preconceptions about the meanings of its parts and
their relationships.” Other principles emphasize the need to reflect
critically on the social and historical background of the research setting
(contextualization) and how research materials were socially
constructed through interaction between researchers and participants
(interaction). The principle of dialogical reasoning requires sensitivity to
possible contradictions between theory and findings. Similarly, the
principle of multiple interpretations requires sensitivity to differences in
participants’ views, while the principle of suspicion requires sensitivity
to possible biases and distortions in those views. Application of the
principles of the hermeneutic circle and contextualization yields
interpretations of data collected in the field. The principle of
abstraction and generalization requires relating these interpretations to
theoretical, general concepts concerning human understanding and
social action.
In contrast to interpretive studies are positivist studies. A research study
may be said to be positivist “if there is evidence of formal propositions,
quantifiable measures of variables, hypothesis testing, and the drawing
of inferences about a phenomenon from a representative sample to a
stated population” (Orlikowski & Baroudi, 1991). There are some
obvious challenges associated with a positivist approach to research in
resilience engineering at this stage. For example, there remain many
contrasting definitions of resilience itself, as well as of the factors
that are associated with it.
Combining interpretive and positivist approaches seems a reasonable
way to make progress in developing this new area of research, but few
studies—at least in the social sciences—seek to do so, and indeed there
are very few guidelines to lead the way. One approach is triangulation,
which may be defined as “the combination of methodologies in the
study of the same phenomenon” (Denzin, 1978). There are two main
approaches to triangulation: between (or across) methods, and within
method (Denzin, 1978). Within-method triangulation “essentially
involves cross-checking for internal consistency or reliability while
‘between-method’ triangulation tests the degree of external validity”
(Jick, 1979). Triangulation can provide “a more complete, holistic, and
contextual portrayal of the units under study,” though it is important to
keep in mind that “effectiveness of triangulation rests on the premise
that the weaknesses in each single method will be compensated by the
counter-balancing strengths of another” (Jick, 1979). The remainder of
this chapter discusses a combined interpretive and positivist approach to
the measurement of factors associated with resilience, with a particular
focus on the use of triangulation for improving measurement reliability
and validity.
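As one concrete illustration of the within-method case, the sketch below (hypothetical codes) cross-checks internal consistency by computing Cohen’s kappa, chance-corrected agreement, between two analysts who independently code the same interview passages.

    # Minimal sketch: within-method triangulation as inter-coder agreement.
    from collections import Counter

    def cohens_kappa(coder_a, coder_b):
        n = len(coder_a)
        observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
        freq_a, freq_b = Counter(coder_a), Counter(coder_b)
        expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
        return (observed - expected) / (1 - expected)

    a = ["plan", "improvise", "plan", "improvise", "improvise", "plan"]
    b = ["plan", "improvise", "plan", "plan", "improvise", "plan"]
    print(round(cohens_kappa(a, b), 3))  # 0.667: substantial but imperfect agreement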
Identifying and Measuring Factors Affecting Resilience
in Extreme Events
Extreme events may be regarded as events which are rare and
uncertain, and whose consequences are potentially high and broad
(Stewart & Bostrom, 2002). There are some immediately obvious
reasons to study resilience in the context of the response to extreme
events. The performance of organizations in such situations is often at
the boundary of their experience. The response is conducted by skilled
individuals and organizations, who must make high-stakes decisions
under time constraint (Mendonça & Wallace,
2007a). On the other hand, the boundaries of experience may be
difficult to identify a priori (i.e., before the event has occurred) and
perhaps even afterwards. It is very likely that unskilled individuals and
organizations will participate in the response. The decisions taken
during the response may be very difficult to evaluate, even after the
event. Finally, the long lag times between events—coupled with the
difficulties involved in predicting the location of events—can make
pre-event monitoring impractical and perhaps impossible.
When a disaster is sufficiently consequential (e.g., Category IV or V
hurricanes, so-called strong earthquakes), public institutions may
provide essentially unlimited buffering capacity in the form of
personnel, supplies or cost coverage. On the other hand, non-extreme
events that nonetheless test organizational resilience (i.e., those typically
called crises) require that this buffering capacity reside within the
impacted organization. In the extreme event situation, then, buffering
capacity is essentially unlimited. The remainder of this section therefore
offers preliminary thoughts on the measurement of margin, tolerance,
and flexibility/stiffness (cross-scale interactions will be discussed briefly
in the context of flexibility/stiffness).
Margin
System boundaries may be said to represent both limits of performance
(e.g., person-hours available for assignment to a task within the system)
and the borders that separate one organization from the outside world
(e.g., entry and exit points for the products associated with the system).
For all but the simplest systems, multiple boundaries will be present for
both types, requiring organizations to reckon their performance along
multiple (sometimes conflicting) dimensions. Measuring the margin of a
system, then, requires an approach that acknowledges these dimensions,
along with possible trade-offs among them. Given the nature of
extreme events, as well as their ability to impact system performance,
the dimensionality of this assessment problem poses considerable
challenge to the measurement of margin.
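One way to acknowledge these multiple dimensions is sketched below (hypothetical dimensions, units and limits): each margin is normalized by its boundary so dimensions can be compared, and the binding margin, the dimension closest to its limit, is reported alongside the full profile.

    # Minimal sketch: multi-dimensional margin with a binding (smallest) dimension.
    def margins(state: dict, limits: dict) -> dict:
        # Normalized slack per dimension: 0.0 means at the boundary.
        return {k: (limits[k] - state[k]) / limits[k] for k in limits}

    state  = {"person_hours": 4200, "feeder_capacity_mw": 310, "generators": 38}
    limits = {"person_hours": 5000, "feeder_capacity_mw": 400, "generators": 50}

    m = margins(state, limits)
    print(m)                              # per-dimension margin, as fractions
    print("binding:", min(m, key=m.get))  # the dimension closest to its limit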
Tolerance
Like margin, tolerance refers to boundary conditions of the system. In
this case, however, the concept describes not the performance of the
system but rather how that performance is achieved: that is, how the
people, technologies and processes of the system function. In
measuring margin, a chief problem is paucity of data; in measuring
tolerance, the challenge is to develop process-level descriptions of
organizational behavior. For example, this might entail pre- and
post-event comparisons of communication and decision-making processes at
the individual, group and organizational levels. Given the rarity of
extreme events, cross-organizational comparisons may not be valid
beyond a very limited number of organizations.
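A minimal sketch of such a pre-/post-event comparison follows (synthetic values for a single, hypothetical process metric, decision latency in minutes); the point is that tolerance is read less from the averages than from how behavior holds up as variability grows near the boundary.

    # Minimal sketch: comparing one process-level metric before and after an event.
    from statistics import mean, stdev

    pre_latency  = [12, 15, 11, 14, 13, 12]  # routine operations
    post_latency = [4, 6, 3, 9, 5, 22]       # operating near the boundary

    for label, xs in [("pre-event", pre_latency), ("post-event", post_latency)]:
        print(f"{label}: mean={mean(xs):.1f} min, sd={stdev(xs):.1f} min")
    # Tolerance is suggested less by the means than by whether performance
    # degrades gracefully as the spread widens near the boundary.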
Flexibility/Stiffness
The literature on organized response to disaster has shown the
importance of planning (Drabek, 1985; Perry, 1991) to organizational
capacity to respond to extreme events, but it has also shown that
flexibility and an ability to improvise remain crucial in mitigating losses
during response (Kreps, 1991; Turner, 1995). Indeed, the literature on
emergency response is replete with examples of how response
personnel have improvised social interaction (Kreps & Bosworth,
1993), behavior (Webb & Chevreau, 2006) and cognition (Vidaillet,
2001; Mendonça & Wallace, 2003; Mendonça, 2007) in seeking to meet
response goals. Yet the measurement of flexibility and improvisation
has been concentrated on product-related constructs, such as the
perceived degree of effectiveness and creativity in the response. Only
recently have there been attempts to develop process-related measures,
and these are limited to cognitive and behavioral constructs. The final
factor thought to contribute to resilience is cross-scale interactions,
which relates closely to decision making and communication strategies,
and therefore to the cognitive processes that underlie these strategies.
Cross-scale interactions generally refer to within-organization
interactions, though it may be possible that cross-organization
interactions are also relevant to resilient performance. These two
factors are complementary: flexibility/stiffness refers to organizational
restructuring, while cross-scale interactions may be seen as a particular
type of restructuring, one in which new processes emerge during the
response. Consequently, cross-scale interactions will be discussed in
the context of flexibility/stiffness.
Resilient Performance in Practice
As suggested above, development of concepts concerning the factors
that contribute to resilience has progressed to the point where it is now
appropriate to consider how these factors may be measured. This
section reports on the development of measures for margin, tolerance
and flexibility/stiffness as manifested during the response to the 2001
attack on the World Trade Center (WTC). As a result of the attack,
there were extensive disruptions to critical infrastructure systems in
New York City, leading to local, national and international impacts.
Some disruptions were isolated to single systems, while others cascaded
across systems, clearly demonstrating interdependencies that existed
either by design (e.g., power needed to run subway system controls) or
that emerged during and after the event itself (e.g., conflicting demands
for common resources) (Mendonça & Wallace, 2007b).
A number of studies have detailed the impact of the attack on critical
infrastructure systems (O'Rourke, Lembo, & Nozick, 2003), as well as
described some of the restoration activities of subsequent months.
Damage to the electric power system was considerable, certainly
beyond what had been experienced in prior events. This included the
loss of 400 megawatts (MW) of capacity from two substations which
were destroyed following the collapse of World Trade Center building
7, and severe damage to five of the feeders that distributed power to
the power networks. Indeed, five of the eight electric power
distribution networks in Manhattan were left without power. In total,
about 13,000 customers were left without power as a result of this
damage. Restoration of this power was an immediate high priority for
the city and, in the case of the New York Stock Exchange, the nation.
Within the telecommunications infrastructure, the loss of power
impacted a major switching station, backup emergency 911 call routing
and consumer telephone service, all located within the building housing
the switching station. The company’s task was to restore power to the
building and recommence telecommunications services as quickly as
possible.
Taken together, these studies provide a means for understanding the
link from initiating incidents (e.g., power outages), to disruptions (e.g.,
loss of subway service due to lack of power for signaling devices) and
finally to restoration (e.g., the use of trailer-mounted generators for
providing power to individual subway stations). The human side of
both impacts and restoration, on the other hand, has not been nearly as
well explored. Since resilience encompasses both human and
technological factors, it is appropriate to consider how measures for
both sets of factors may be defined and estimated in order to clarify the
concept of resilience.
Method
Data collection activities associated with both studies may be
characterized as initially opportunistic, followed by stages of focused
attention to salient sources. An initial concern was simply how to gain
access to these organizations. Existing contacts within both industries,
combined with the support of the National Science Foundation, were
instrumental in providing initial entrée. The project’s brief was
to study organized response in the restoration of interdependent critical
infrastructure systems. At the time the data were being collected
(beginning in late 2001), few studies had addressed the role of the
human managers of these systems, instead concentrating on technical
considerations of design and management. There were therefore few
exemplar studies – and very little direct methodological guidance – on
how to proceed in the study. Both studies therefore adopted a strongly
interpretive approach to the evolving design of the study.
Initial consultations with the companies responsible for the power and
telecommunications infrastructures being studied were done in order to
identify critical incidents, particularly those which involved highly
non-routine responses. Direct consultations were held with upper
management-level personnel, who then contacted individuals involved
with the candidate incidents in order to assess whether they would be
able (or available) to take part in the study. This occasionally led to
additional, clarifying discussions with management, usually to
investigate expanding the respondent pool. A considerable amount of
time went into developing a respondent pool that spanned those levels
in the organization that were involved in the incident. For example,
study participants ranged from senior vice presidents to line workers
(e.g., those who conducted the physical work of repairing the
infrastructures). For the power company, the initial consultations led to
a set of eight incidents. For the telecommunications company, various
incidents were discussed, but only one could be investigated given the
time commitments of interview subjects, many of whom were still
deeply involved in other restoration activities.
In hindsight, timely data collection was paramount to the success of
both studies. From a practical perspective, data collected so soon after
the fact were fresh – something particularly desirable for data drawn
from human subjects. It also provided an opportunity for the study
team to demonstrate that it could collect data without causing
unreasonable perturbations in the work patterns of study participants.
Data collection methods reflected both the goals of the project and the
perspectives of the four investigators, two of whom were involved in
the study of human-machine systems, and two of whom were involved
in the technical design aspects of infrastructure systems. Discussions
amongst the investigators produced agreement on the salience of core
concepts from “systems engineering” (e.g., component and system
reliability, time to restoration) as well as human psychology (e.g.,
planning, decision making, feedback) to the study. Given the range of
core concepts, the points of contact at the companies were asked to
request study participants to come prepared to discuss the incident, and
to bring with them any necessary supplementary materials (e.g., maps,
drawings). Suggestions on supplementary materials to bring were
sometimes made by the points of contact and the investigators. A
detailed protocol for the interviews was provided to these points of
contact for review and comment.
With a few exceptions, the Critical Decision Method (Flanagan, 1954;
Klein, Calderwood, & MacGregor, 1989) was used for all interviews,
with two interviewers and one or two respondents. One interviewer
asked the probe questions (Klein et al., 1989); a second took notes
(with two exceptions, it was not possible to audio- or video-record the
interviews). The critical decision method (CDM) is a modified version
of the critical incident technique (Flanagan, 1954) and, like other
cognitive task analysis methods, is intended to reveal information about
human knowledge and thinking processes during decision making,
particularly during non-routine decision making (Klein et al., 1989). It
has been used in a wide variety of studies (see (Hoffman, Crandall, &
Shadbolt, 1998) for a review). The five stages of the procedure were
completed in all interviews (i.e., incident identification and selection;
incident recall; incident retelling; time line verification and decision
point identification; progressive deepening and the story behind the
story). However, not all interviews were equally detailed. In practice—
and following guidance in the use of this method—the choice of probe
questions asked of respondents was determined mainly by study
objectives, but also by exigency. For example, all respondents were
asked whether the incident fit a standard or typical scenario, since the
study was strongly informed by work on organizational improvisation,
and plans may be highly relevant as referents for improvised action. On
the other hand, probe questions concerning mental modeling were
never asked, since the investigators had decided early on that formal
modeling of the reasoning processes of respondents would not be
feasible for the project. At other times, respondents simply did not have
the time to commit to a full-scale interview. Most respondents appeared
highly cooperative.
The investigators emphasized throughout their discussions with points
of contact and interview participants that logs of system behavior were
vital to the study design, since these provided the closest possible
approximation of the behavior of technical systems during response
and recovery activities. Materials brought to interviews included system
maps, engineering drawings, photos, field notes and meeting minutes.
These materials were sometimes extensively used. In fact, interviews
which did not include these materials tended to be less illuminating
than those where they were used. When these materials were present, it
was far easier to keep the interviews grounded in the lived experiences
of participants. Finally, it should be noted that other logs were collected
with the help of the points of contact. These were reviewed with
company personnel for completeness and accuracy, and any identified
deficiencies were noted.
At the conclusion of each interview, participants filled out a brief
questionnaire on their background and experience. A second
questionnaire, adapted from work by Moorman and colleagues on
improvisation by organizations (Moorman & Miner, 1998; Miner,
Bassoff, & Moorman, 2001), was used to measure organizational
improvisation, organizational memory and the evaluation of the
response. Finally, it should be noted that a small number of
supplementary materials, such as newspaper and other reports from the
popular press, were sometimes used by the investigators to provide
context on the activities of the two companies. The distribution of
these different types of data across the two studies is given in Table 1.
Results
Data collection activities took place beginning in late 2001 and
continued throughout 2002. In total, eleven in-depth interviews were
conducted (ten for electric power, one for telecommunications), along
with approximately 20 shorter sessions, typically with one interview
subject per session. Other data sources are described below.
Infrastructure       Organizational Units of        Data Sources
                     Interview Participants
Electric Power       engineering                    interviews
                     emergency management           log data
                     electric operations            meeting notes
                     energy services                after-action reports
                     distribution engineering       photographs
                                                    drawings
                                                    questionnaires
Telecommunications   network operations             interview
                                                    questionnaire
                                                    after-action reports
                                                    photographs

Table 1: Summary of organizational units and data sources in the
studies
Overview of Restoration Activities
The power company engaged in two inter-related strategies for
restoring power: connecting trailer-mounted portable generators to
provide spot power; and installing temporary feeder lines – called
shunts – to connect live networks to dead ones. The
telecommunications company also relied upon trailer-mounted portable
generators. An overview of these three critical decisions is presented
before illustrating the development and implementation of measures of
margin, tolerance and flexibility/stiffness.
For the power company, the loss of distribution capacity was far
beyond the scale of previous incidents. Soon after the attack, the
company began attempting to procure trailer-mounted generators in
order to provide spot power to critical customers. By 12 September, it
was clear that the amount of time and effort required to secure, install
and operate these generators would be considerable. As a result, the
company decided to create a Generator Group, comprised of
individuals from various parts of the organization, which would have
primary responsibility for work in this area. The second part of the
company’s strategy was the use of shunts – cables with 13 kilovolt (kV)
capacity – which were used to make connections between dead
networks and live ones. This task was handled by existing units in the
organization (such as Distribution Engineering and Electric
Operations). Procedures executed by these units included determining
shunt routes through the city and coordinating pick-ups (i.e., the actual
connecting of the shunts to the networks).
For the telecommunications company, the loss of power to the building
would have triggered a standard operating procedure to connect a
generator to the building via hookups in the basement. However, water
and debris in the basement made this procedure unexecutable. A
decision was then made to connect cable from the diesel generators
directly to the floors which they were to power, but, according to
an interview respondent, “there’s no good way of doing that, because it’s
all hard wired in an elaborate system of switches.” The solution
required cutting riser cables above the basement and attaching them to
portable generators with capacities between 1 and 2.5 megawatts.
(Risers are cables within the building that are normally used to transmit
power throughout the building.) The task of connecting the cables
required considerable care in order to ensure that cables were properly
matched. Generators were running by Friday, 14 September. A gradual
transition was then made to commercial power, essentially resolving the
incident (though generators remained on stand-by and were periodically
tested). (See Mendonça, 2007 for a complete discussion of the case.)
Measuring Resilience
Examining the initial definitions for margin and tolerance, it is clear that –
in order to estimate these factors – it is necessary to identify system
boundaries. In the electric power study, study participants and members
of the research team offered numerous suggestions for a candidate set of
boundaries. For example, staff utilization represents the extent to which
personnel in an organization are utilized. Proxy measures for this
construct had been monitored (e.g., sign-in sheets for on-duty
employees), but ultimately could not be made available to the research
team for reasons of employee confidentiality. Other examples include
transmission system capacity, which represents the amount of power
that could be delivered over existing infrastructure. A complete picture
of transmission system capacity was not available. It was possible,
however, to estimate the incremental contributions made to transmission
capacity through the installation of generators and shunts. A more
sophisticated measure might combine such capacity measures with
estimates of anticipated load from customers.
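The following sketch (with invented daily figures; the actual restoration data are not reproduced here) illustrates such an estimate, tracking cumulative generator and shunt contributions against a hypothetical figure for anticipated load.

    # Minimal sketch: incremental restored capacity versus anticipated load.
    generator_mw = [0, 5, 12, 20, 35, 55, 70]  # cumulative spot generation, by day
    shunt_mw     = [0, 0, 13, 26, 52, 78, 91]  # cumulative shunt capacity, by day
    anticipated_load_mw = 160                  # hypothetical demand estimate

    for day, (g, s) in enumerate(zip(generator_mw, shunt_mw)):
        restored = g + s
        print(f"day {day}: {restored} MW restored "
              f"({restored / anticipated_load_mw:.0%} of anticipated load)")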
In the telecommunication study, system boundaries were considerably
more difficult to discern. The main reasons were the study’s reliance on
a limited range of participants and the highly localized nature of the
incident: the case study concerned the restoration of
power to a single building. It should also be mentioned that restoration
activities were still being conducted during site visits by the research
team, and therefore there were limits on the amount of time that
participants could devote to supporting data collection. Resource
utilization was discussed in terms of managing demand, since there was
sufficient slack in the system to allow services that were normally
provided through the facility to be provided through other facilities.
The amount of load on the network was also discussed in this context.
Factor                 Power                                        Telecommunications
Margin/Tolerance       Transmission capacity                        Resource utilization
                       Network stability                            Network load
                       Network load
                       Resource utilization
Flexibility/Stiffness  Restructuring of organizational units        Development of new procedures
                       Development of new procedures
                       Recognition of unplanned-for contingencies
                       Identification of opportunities for renewal

Table 2: Candidate measures for factors contributing to resilience
As with other extreme events, then, both margin and tolerance are difficult
to evaluate, since organizational boundaries are difficult to identify. In
the power restoration case, a key observation is that the magnitude of
the restoration problem far exceeded that of previous experience.
Indeed, while generators had been part of previous restorations, the
company had never before needed this quantity in such a short time.
Using the available data for the generator strategy, it does appear that
the path to restoration – as indicated by the cumulative number of
generators connected to the network – followed an S-shape, similar to
that of a prototypical learning curve. In the case of the shunt strategy,
the number of feeder connections made per day suggests a straight path
to achieving sufficient interim capacity.
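One way to probe the S-shape observation quantitatively is sketched below (with invented counts, and assuming numpy and scipy are available): fit a three-parameter logistic curve to the cumulative number of generators connected over time, then inspect the fitted plateau and inflection point.

    # Minimal sketch: testing for an S-shaped restoration path via a logistic fit.
    import numpy as np
    from scipy.optimize import curve_fit

    def logistic(t, K, r, t0):
        # K: eventual total, r: growth rate, t0: inflection day.
        return K / (1.0 + np.exp(-r * (t - t0)))

    days = np.arange(14)
    connected = np.array([0, 1, 2, 4, 8, 15, 25, 36, 44, 49, 52, 53, 54, 54])

    (K, r, t0), _ = curve_fit(logistic, days, connected, p0=[55.0, 1.0, 7.0])
    print(f"plateau ~{K:.0f} generators, inflection near day {t0:.1f}")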
The nature of flexibility/stiffness in the power restoration case is
suggested by its decision to create a new organizational structure—the
Generator Group—almost immediately after the attack in order to
manage generator procurement and use. The group was dissolved once
the generators ceased to be a crucial part of the restoration plan. In
other interviews (not discussed here), respondents stated that some
existing organizational units improvised their roles, undertaking tasks
that were within the capability of the organization but which were not
in the usual range of activities for the units themselves. This
phenomenon has been amply demonstrated in the response to many
other events (Webb, 2004). Flexibility/stiffness is also reflected in the
major restructuring of the physical network, resulting in a new design for
the distribution system (i.e., one using three larger networks instead of
eight smaller ones). In contrast to the generator situation, there was no
major restructuring of organizational units.
In the case of telecommunications restoration, evidence of flexibility is
found in the development of new procedures. For example, during the
interview the manager emphasized the limited usefulness of plans
during the response. He stated “If I’d had to go to anything other than
my head or someone else’s it wouldn’t have worked. You don’t pull a
binder off the shelf on this one. You certainly wouldn’t grab a laptop
and go into something.” Indeed, “no one to my knowledge went into a
system quote unquote that gave them an answer in terms of what to
do.” Yet on the other hand, he stated earlier in the interview that the
decision to use diesel generators was made in a “split-second.”
Similarly, the decision to connect the generators to the risers was “one
of those decisions that truly took milliseconds. I said, OK we have to
get the building risers – meaning the hard-wired cables – cut them and
splice cables from the street into the riser.”
Discussion
By considering these studies with respect to the principles for evaluating
interpretive field studies, recommendations may be made for how to
proceed in future studies of power and telecom restoration. Many of
the principles for evaluating interpretive studies given by Klein and
Myers (1999) speak directly to some of the challenges involved in
researching organized response to extreme events. For example, there
were close interactions between researchers and subjects throughout the
project. Indeed, some individuals in the organizations were both
subjects and points of contact, and it was through interactions with
these individuals that potential data sources and subjects were
identified. Both companies were clearly interested in seeing the results
of this work and looking for ways to apply them to their organizations.
The principle of suspicion applied here to both sides of the relationship
between researchers and subjects. For example, interim reports to key
points of contact enabled both groups to look for evidence of bias or
distortion. In practice, the principle of multiple interpretations can be
difficult to follow in situations such as these, since there is a natural
tendency in after-action reporting to construct coherent, even logical or
linear narratives to explain the observed sequence of events. Finally,
numerous personnel – particularly those who had been with the
company for extended periods of time – discussed the relevance of
prior experience in their efforts towards restoration. While the inclusion
of these observations in the study results may improve the assessment
of the study, in practice it was difficult to apply the principle of
suspicion to these observations, since they drew upon incidents that
had occurred decades earlier.
A variety of approaches to measuring the factors is evident from the
cases and subsequent discussion. In order of increasing granularity,
they may be described as follows:
• Output measures that describe the resilient performance (e.g., mean time to restoration). These offer limited insights.
• Measures that describe the impact of contextual factors on resilient performance. This approach is only useful to the extent that these contextual factors can be measured, and it does not unveil process-level phenomena.
• Process measures that show the observed relationship between inputs and outputs, perhaps including explanations of the impact of contextual factors.
• Model-based explanations, which make ongoing predictions about the processes that translate (observed) inputs into (observed) outputs.
Associated with any of these approaches are two threats to validity.
First, post-event reports by response personnel are notoriously
unreliable and potentially invalid, particularly when cognitive demands
are unusually high. To achieve consistency (i.e., internal validity), it is
often necessary to triangulate the observations of numerous
participants, and almost certainly to give considerable weight to data
collected concurrently with the occurrence of the phenomena. Second,
the external validity is necessarily limited for all but a few cases. To
achieve generalizability (i.e., some measure of external validity), we
probably need to measure phenomena associated with these factors at a
much lower level and then aggregate the results. As an example, one
might begin with the study of individual processes and aggregate these,
rather than beginning by looking for results at the group or
organizational level.
Concluding Comments
A number of observations from the conduct of this study may be used
to improve the quality of further research into measuring factors
thought to contribute to resilience. In the case of power restoration,
margin and tolerance have been assessed according to the behavior of the
physical system. Yet even with such assessments, it is difficult –
perhaps even impossible – to evaluate performance on this case against
some theoretical optimum. Even post-event assessments are
challenging, leading to the use of measures of relative performance or
efficiency. Engineering estimates of anticipated system performance
tend to be heavily informed by expert judgment rather than historical
data (National Institute for Building Sciences, 2001). Evidence of
flexibility in the power case is found in the company’s efforts at revising
its organizational structure, but also in revising its activities and the
assignment of personnel to those activities. The design of the physical system may have
helped determine the organizational structure, a question that might be
further investigated through other studies of physical change in
infrastructure systems.
Given the practical difficulties of developing and using measures that
can be used in assessing – and eventually engineering – organizational
resilience in response to extreme events, it is reasonable to plan on
achieving understanding through the use of multiple methods. Given
the importance of field studies to these efforts, the evaluation of results
may benefit from assessing the methods of these studies against
principles for evaluating interpretive field studies. This study has
illustrated the possible benefits (and complications) of triangulating
observations using both quantitative and qualitative methods. Yet it is
certainly the case that a broader and more comprehensive range of
observation techniques and analytic methods will be necessary in order
to inform theory about how to engineer resilience in power and
telecommunications infrastructures. Access to a wide variety of pre-
and post-event data sources may facilitate this process, but this access
must be negotiated through organizational gatekeepers – further
emphasizing the need to embrace the principles discussed here. There
are certainly opportunities for the development of technologies that
may support measurement concerning organizational boundaries.
Concurrently, better approaches must be developed for capturing data
associated with low-level phenomena in order to support analysis of
organizational resilience at a broad level.
References
Boudreau, M. C., Gefen, D., & Straub, D. W. (2001). Validation in
Information Systems Research: A State-of-the-Art Assessment.
MIS Quarterly, 25(1), 1-16.
Carmines, E. G., & Zeller, R. A. (1979). Reliability and Validity
Assessment. Newbury Park, CA: Sage Publications.
Denzin, N. K. (1978). The Research Act: A Theoretical Introduction to
Sociological Methods. New York: McGraw-Hill.
Drabek, T. (1985). Managing the Emergency Response. Public
Administration Review, 45, 85–92.
Fiksel, J. (2003). Designing Resilient, Sustainable Systems. Environmental
Science and Technology, 37, 5330–5339.
Flanagan, J. C. (1954). The Critical Incident Technique. Psychological
Bulletin, 51, 327-358.
Hoffman, R. R., Crandall, B., & Shadbolt, N. (1998). Use of the Critical
Decision Method to Elicit Expert Knowledge: A Case Study in
the Methodology of Cognitive Task Analysis. Human Factors,
40(2), 254-276.
Hollnagel, E., & Woods, D. (2006). Epilogue: Resilience Engineering
Precepts. In E. Hollnagel, D. Woods, & N. Leveson (Eds.),
Resilience Engineering: Concepts and Precepts. Aldershot, UK: Ashgate.
Jick, T. D. (1979). Mixing Qualitative and Quantitative Methods:
Triangulation in Action. Administrative Science Quarterly, 24(4),
602–611.
Klein, G., Calderwood, R., & MacGregor, D. (1989). Critical Decision
Method for Eliciting Knowledge. IEEE Transactions on Systems,
Man and Cybernetics, 19, 462-472.
Klein, H. K., & Myers, M. D. (1999). A Set of Principles for
Conducting and Evaluating Interpretive Field Studies in
Information Systems. MIS Quarterly, 23(1), 67–94.
Kreps, G. A. (1991). Organizing for Emergency Management. In T. E.
Drabek, & G. J. Hoetmer (Eds.), Emergency Management: Principles
and Practice for Local Governments. Washington, D.C.: International
City Management Association, 30-54.
Kreps, G. A., & Bosworth, S. L. (1993). Disaster, Organizing and Role
Enactment: A Structural Approach. American Journal of Sociology,
99(2), 428-463.
Mendonça, D. (2007). Decision Support for Improvisation in Response
to Extreme Events. Decision Support Systems, 43(3), 952–967.
Mendonça, D., & Wallace, W. A. (2003). Studying Organizationally-situated
Improvisation in Response to Extreme Events. Newark,
NJ: New Jersey Institute of Technology.
Mendonça, D., & Wallace, W. A. (2007a). A Cognitive Model of
Improvisation in Emergency Management. IEEE Transactions on
Systems, Man, and Cybernetics: Part A, 37(4), 547–561.
Mendonça, D., & Wallace, W. A. (2007b). Impacts of the 2001 World
Trade Center Attack on New York City Critical Infrastructures.
Journal of Infrastructure Systems, 12(4), 260-270.
Miner, A., Bassoff, P., & Moorman, C. (2001). Organizational
Improvisation and Learning: A Field Study. Administrative Science
Quarterly, 46(June), 304-337.
Moorman, C., & Miner, A. S. (1998). Organizational Improvisation and
Organizational Memory. Academy of Management Review, 23(4),
698–723.
National Institute for Building Sciences (2001). Earthquake Loss
Estimation Methodology HAZUS99 SR2, Technical Manuals I–III.
Washington, DC: National Institute for Building Sciences.
O'Rourke, T. D., Lembo, A. J., & Nozick, L. K. (2003). Lessons
Learned from the World Trade Center Disaster about Critical
Utility Systems. In J. L. Monday (Ed.), Beyond September 11th: An
Account of Post-Disaster Research. Boulder, CO: Natural Hazards
Research and Applications Information Center, 269-290.
Orlikowski, W. J., & Baroudi, J. J. (1991). Studying Information
Technology in Organizations: Research Approaches and
Assumptions. Information Systems Research, 2(1), 1–28.
Perry, R. (1991). Managing Disaster Response Operations. In T.
Drabek, & G. Hoetmer (Eds.), Emergency Management: Principles
and Practice for Local Government. Washington: International City
Management Association, 201-224.
Stewart, T. R., & Bostrom, A. (2002). Extreme Event Decision Making:
Workshop Report. Albany, NY: University at Albany.
Turner, B. A. (1995). The Role of Flexibility and Improvisation in
Emergency Response. In T. Horlick-Jones, A. Amendola, & R.
Casale (Eds.), Natural Risk and Civil Protection. London: E.&F.
Spon, 463-475.
Vidaillet, B. (2001). Cognitive Processes and Decision Making in a
Crisis Situation: A Case Study. In T. K. Lant, & Z. Shapira (Eds.),
Organizational Cognition: Computation and Interpretation. Mahwah, NJ:
Lawrence Erlbaum Associates, 241-263.
Walsham, G. (1995). Interpretive Case Studies in IS Research: Nature
and Method. European Journal of Information Systems, 4(2), 74–81.
Webb, G. R. (2004). Role Improvising during Crisis Situations.
International Journal of Emergency Management, 2(1-2), 47-61.
Webb, G. R., & Chevreau, F.-R. (2006). Planning to Improvise: The
Importance of Creativity and Flexibility in Crisis Response.
International Journal of Emergency Management, 3(1), 66–72.
Weick, K. E. (1985). Systematic Observational Methods. In G. Lindzey
& E. Aronson (Eds.), The Handbook of Social Psychology. New York:
Random House, 567–634.
Woods, D. (2006). Essential Characteristics of Resilience. In E.
Hollnagel, D. Woods, & N. Leveson (Eds.), Resilience Engineering:
Concepts and Precepts. Aldershot, UK: Ashgate.