Managing the Design of Performance Measures
The Role of Agencies
Yi Lu
State University of New York–Binghamton
ABSTRACT: How to design measures that are technically sound and politically
acceptable has puzzled researchers and practitioners alike. This article presents
two main research questions: How are performance measures selected, and does
the process by which performance measures are selected make a difference in
achieving high measurement quality? By triangulating data collected through
interviews and surveys, this paper examines the implications of an agency-centered bottom–up measurement-design process. This examination identifies
various ways in which measures are selected and calls for an active approach
involving agencies, external professionals, and the central budget office to improve
measurement quality.
KEYWORDS: agency, measurement design, performance
The quality of performance measurement systems is often assessed at two levels:
(a) the technical quality of performance data and (b) the utilization of performance
information (Wholey, 2006). There is no doubt about the importance of these two
aspects; however, how to put in place accurate and useful performance measurement systems is far from resolved in this field. An agreed-upon design/standard for
performance measures remains the core of this puzzle. Why? Developing useful
performance measures is an often-mentioned difficulty in performance budgeting
and management. In this research, I conduct surveys and interviews with state
agencies as well as executive and legislative budget offices in Georgia to (a) study
the performance measures development process, (b) examine various measurement selection methods and their implications for measurement quality, and (c)
discuss the ramifications of the design process. I offer a categorization of issues
in performance measurement identified by the existing literature and propose a
framework to understand the dynamics of measurement quality in terms of how
various selection methods affect the design quality of measurement. I also discuss the advantages and disadvantages of the agency-centered bottom–up
process of measurement design. The purpose of this paper is to reveal the design
process of performance measures, especially from the perspective of the state
agency, to improve measurement quality.
The Puzzle of Performance Measurement
Performance measurement is the label given to the many efforts undertaken within
the public sector to meet the new demand for documentation of results (Wholey
& Hatry, 1992). Specifically, it is the “regular measurement of the results (outcomes) and efficiency of services or programs” (Hatry, 2006, p. 3; also see Hatry,
1999). The practice of performance measurement is not new in the public sector
(Williams, 2003). Performance management concepts have a long-established history, but no single approach has gained acceptance among practitioners and academics alike.
The literature reveals a wide range of views regarding performance management
concepts. Furthermore, debates continue among scholars as to the best approaches
to institute (e.g., see Forsythe, 2001). Generally, the issues of measurement identified by the literature can be categorized into four dimensions: measurement validity
and reliability, usefulness, administrative feasibility, and political acceptance of
measures. First, validity of measurement refers to the extent to which a measure captures what it purports to measure. Validity is considered a significant component of
successful performance measurement (Grizzle, 2001; Melkers & Willoughby,
2004; Willoughby, 2004). For some programs, such as those in federal research, performance proves extremely difficult, if not impossible, to measure meaningfully (Wholey & Hatry, 1992). Reliability is another component, one that relates to measurement consistency and stability. A measure that yields similar results across different raters and over time demonstrates a high level
of stability. Both validity and reliability are considered nagging issues in the
public sector because of the multiple constructs of public service performance
and the various lenses through which public performance is evaluated by different
stakeholders (Brewer, 2006).
Second, how to make measures useful has attracted growing attention (Hatry,
Gerhart, & Marshall, 1994). It has become widely recognized that having valid
measures is not an end in itself and that usefulness is a distinct construct from
validity and reliability. A measure can be valid and reliable but not useful when data are too highly aggregated to be meaningful for street-level personnel, when performance targets are difficult to set, or when measures are difficult to match to their purposes (Hatry, 1997).
As Behn forcefully argued, these issues must be seriously addressed before we
can “select measures with the characteristics necessary to help achieve each purpose” (2003, p. 587). Third, feasibility refers to a myriad of issues such as ease of
obtaining data, availability of performance information via electronic databases,
and financial and human resources for developing measures (Wang, 2000). Finally,
the issue of political acceptance refers to trust in the measures by stakeholders other than their developers.
The selection of performance measures is often subjective and arbitrary (Andrews, Boyne, & Walker, 2006). Consequently, Pandey and Moynihan suggested
that the selection of measures should begin with “an explicit recognition of
underlying choices and resultant trade-offs” (2006, p. 137). Newcomer (1997)
suggested that defining performance is an inherently political process and the
decision about what to measure reflects two key factors: the intended uses of
measures and the value or priorities of the stakeholders selecting the measures.
Therefore, disagreements among stakeholders are not unusual when selecting
performance measures. The Office of Management and Budget (OMB) rated 22 percent of the 977 programs it assessed as "results not demonstrated" because OMB and the agency could not agree on performance measures or because performance information was inadequate (OMB, 2007). This high percentage of disagreement regarding useful performance measures highlights the differences in philosophy among public officials.
Clearly, each of the four categories of issues represents a different challenge to successful performance measurement. In their work on federal job training
programs, Courty and Marschke (2003) provided a vivid account of the complexities involved in developing a performance measurement system. Although
Joyce (1993) pointed out that the short-run emphasis should remain with the
development of performance measures, measurement quality remains a key issue for administrators/managers 15 years later. Thus, the puzzle of performance
measurement continues.
I suggest that a framework to understand the dynamics of measurement quality
starts with understanding the ways in which performance measures are designed
(see Figure 1). The underlying assumption is that the process and methods of
measurement design that we employ link to measurement quality. In terms of
participants in the process, agencies are viewed as the center of most efforts to
produce performance information in the budget process (Joyce, 2003; Lu, 2007;
Moynihan & Ingraham, 2003; Willoughby & Melkers, 2000). Given the pivotal roles of agencies in this process, an in-depth study of their functions is in order. I use an approach that involves not only a detailed understanding of the roles of agencies but also an appreciation for the interactive relations among agencies and other participants. The relative influences of these other participants are very important to the success of the measurement system because different institutional participants and functions of government require various measurement forms to meet their decision needs (Radin, 2000).

Figure 1. Dynamics of Measurement Quality: the selection methods of performance measures operate through four conductors (validity and reliability, usefulness, administrative feasibility, and political acceptance) that jointly shape measurement quality.
Alongside the participants, equally important are the various methods used to
choose measures. The fundamental question remains: Why do agencies prefer
measure A over measure B in the measurement design process? The preceding framework helps to unveil the reasoning behind these decisions because it treats the four categories of issues as interconnected conductors of measurement quality. That is, to understand the dynamics of measurement quality is to understand how and why various selection methods affect these four conductors. A successful measurement design process should involve selection methods
that necessitate a continuous fine-tuning of one or more conductors with an eye
on the remaining ones. This approach is necessary because the conductors should
work together to balance the demands for both technical capacity and political
acceptance. I hypothesize that the selection methods of measures have differing
impacts on these conductors.
Specifically, this study addresses the following research questions:
1. What role do agencies play in the processes used to develop performance
measures?
2. What are the methods used to select performance measures?
3. How do the methods by which measures are selected make a difference, if any, in achieving high measurement quality?
4. What are the implications of this process for the development of measures, and
what are the solutions to their disadvantages?
Research Design
The study involves interviewing and surveying agency fiscal/budget officers in
the State of Georgia. The State of Georgia provides a useful research site because
of many existing studies regarding performance measurement (Douglas, 1999;
Huckaby & Lauth, 1998; Lauth, 1978, 1985, 2004; Lu & Facer, 2004) and the
state’s long history of performance-driven budget reforms. The complexities of
state performance measurement systems do not allow us to know with certainty the
generalizability of the Georgia experience among other states. However, Georgia
does share some similarities with other states, which may allow for certain generalizable comparisons among select states. For instance, it is one of many states
in which the governor is responsible for budget preparation,1 the process in which performance measures are developed. Finally, this research provides a basis for a comparative
study on the dynamics of measurement quality.
For the purposes of this study, all interviews were conducted between July 2005
and January 2006 with 31 of 35 fiscal/budget officers (89 percent) associated with
agencies listed in the Executive Branch section of the Governor’s Budget Report
(Amended FY 2005 & FY 2006).2 The interview protocol consisted of a set of
open-ended questions about their experience in designing measures. The average
interview lasted one hour. Interviews were recorded after receiving consent from
the interviewees and were transcribed into written notes after the sessions; for those
not recorded, notes were prepared during and after the interviews. Confidentiality
was promised to each of the interviewees in writing.
To mitigate possible limitations in qualitative analysis, surveys were conducted
with both fiscal/budget officers and agency heads in all entities, including large
agencies, attached agencies, and authorities. In total, 194 surveys were mailed to
agencies, with a response rate of 65 percent.
Furthermore, to account for possible differences in opinion among different
branches of Georgia state government, additional interviews and surveys were conducted with executive budget office directors, executive budget analysts, and House
and Senate budget analysts. I used surveys and interviews with other branches of
state government to supplement and clarify findings from state agencies.
Content analysis was applied to study interview notes. The interview notes
were analyzed and categorized in relation to each research question. This process was
iteratively conducted with responses from each state agency and for each research
question. After the iterative process, similarities and differences among responses
were compared and patterns uncovered to address the research questions. Specifically,
to examine the process of measurement design (research question 1), the interview
notes were summarized to show the general pattern in the process of performance
measurement design in terms of who the main participants were and their relative
influences. The focus of analysis was directed at the roles and relative influences
of institutional participants: agencies, central budget office, and legislature. Survey
responses were employed to triangulate the validity of accounts given by interview
respondents. To address the factors influencing the selection of measures (research
question 2), interview notes were coded and ranked by the frequency with which
each factor was mentioned by respondents. Interviewees' accounts of the rationale for each selection method were summarized. To address how selection
methods impact measurement quality (research question 3), I divided the data sample into two subsets: those who claimed their measurement was trustworthy and those who claimed their measurement needed work. Measurement quality is defined as agencies' perception of the quality of their measures. These two groups were compared and contrasted to distinguish patterns in their selection methods.
Lastly, the possible solutions for the disadvantages of an agency-centered measurement design process are identified and summarized. The overall purpose is
to understand the process for designing performance measures.
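To make the coding step concrete, the short Python sketch below ranks selection factors by the frequency with which respondents mention them, as was done for research question 2. The coded records here are hypothetical stand-ins, not the study's actual data.

from collections import Counter

# Hypothetical coded interview records: the selection factors each agency
# respondent mentioned. Illustrative only; not the study's actual data.
coded_responses = {
    "Agency A": ["previous_measures", "strategic_plan", "manager_input"],
    "Agency B": ["previous_measures", "data_availability"],
    "Agency C": ["previous_measures", "manager_input", "stakeholder_requests"],
}

# Rank factors by how many respondents mentioned them (research question 2).
mentions = Counter(factor for factors in coded_responses.values() for factor in set(factors))
for factor, count in mentions.most_common():
    print(f"{factor}: mentioned by {count} of {len(coded_responses)} respondents")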
Findings
The Process of Measurement Design
In general, the overall performance system involves four phases: performance
initiation, measurement design, system implementation, and information utilization. I focus on the measurement design phase, although I also briefly identify
and summarize the key activities in the other three phases.
In Georgia, performance budgeting is initiated by the central executive budget
office, the Office of Planning and Budget (OPB), which issues the budget preparation procedures for each fiscal year. The role of the OPB is to set the guidance and
parameters of performance measurement. The OPB requires that each program (including subprograms) have at least one performance and one result measure,3 that measures be submitted by a certain date, and that they be developed in conjunction with a given agency's strategic and annual business plans. Georgia's Planning and Budgeting for Results Model shows the end product of the OPB-led performance initiation phase (Governor's OPB, 2004).
Although the OPB initiates the process of performance measurement, agencies must carry out the design themselves. During the measurement
design phase, while taking the OPB’s guidelines into consideration, agencies hold
strategic planning and budget meetings. These meetings are often brainstorming
sessions during which each agency discusses its budget issues, program goals, and
possible measures. Interestingly, although meeting involvement varies, common practice routinely includes three groups of participants: agency management,
fiscal/budget officers, and division/program managers. Agencies with considerable
experience in performance measurement tend to have a central analytical staff in
charge of agency strategy and planning that participates in the process. Each of
these groups plays a different role in the meeting. The agency management sets
the visions and goals for the agency. The management also approves performance
measures. The fiscal/budget officer passes the requirements of the central executive
budget office to the agency and often serves as the central keeper of performance
data for the agency. The division/program managers are the main providers of
information as to which measures are, or could be, used. Often, the target performance level of each measure is decided between division/program managers
and the agency management in subsequent one-on-one meetings. Some agencies
conduct focus group analyses, constituent meetings, or customer surveys to aid
the development of budgets and measures prior to brainstorming sessions. Moreover, some agencies differentiate three types of measures: those used for agency
operation and management (referred to as informal measures), those submitted
to the OPB via the official computerized budget system (formal measures), and
those that the OPB’s individual analysts request when the OPB considers budget
requests (additional measures).4 This division suggests that defining the unit of
useful measurement for agencies and the central budget office simultaneously
is difficult. Agencies also revisit previous measures during the selection phase.
Although agencies more or less have a process for designing performance measures, this process is considered distinct from budget preparation.
The OPB’s input in the measurement design phase is limited. It tends not to
participate in the selection of informal measures. With regard to formal measures,
the OPB’s participation ranges from no participation at all to constant exchanges
between agency officers and corresponding OPB analysts. The OPB’s degree of
involvement depends on its working relationship with an agency and the working
style of each agency/analyst. Once the measures are designed, they are reviewed
by the OPB. In most cases, OPB analysts take the formal measures as they are
drafted and request additional measures and data as they deem necessary for the
issues they analyze. Overall, citizens and legislatures seldom participate in the
design phase of the performance process. All performance measures and agency
strategic plans are reported annually with the budget process. The central budget
office publishes performance measures annually on its agency Web site.
During the implementation phase, agencies again play the most significant role
of all participants. The common activities include staff training, data collection
and computerization, and evaluation of the validity of both measures and data.
Utilization takes place, with a wide range of frequency and intensity, throughout
the year. The participants and their activities in each phase are summarized in
Table 1.
Table 1. The Key Participants and Activities in the Performance Measurement Process

Performance initiation
  Key participants: Central Executive Budget Office
  Key activities:
  • Communicate governor's performance initiatives
  • Set the minimum requirement of performance measurement and reporting
  • Put together a financial/budget system to consolidate and process performance information

Measurement design
  Key participants: Individual agencies (agency management, fiscal/budget officers, division/program managers) and Central Executive Budget Office
  Key activities:
  Individual agencies:
  • Set agency missions and goals to relate to measures
  • Revisit past year experience of measurement
  • Hold strategic planning and budget meeting(s) to discuss measures
  • Performance target negotiation
  • Fulfill the requirements of the Central Executive Budget Office
  Central Executive Budget Office:
  • Function at various capacities depending on the working relationship with agencies and the operational style of each agency

System implementation
  Key participants: Individual agencies
  Key activities:
  • Staff training
  • Data collection and computerization
  • Data analysis
  • Evaluation of the validity of both measures and data

Information utilization
  Key participants: Individual agencies, Central Executive Budget Office, and legislature
  Key activities:
  • Utilization takes place, with a wide range of frequency and intensity. Compared with the Central Executive Budget Office and the legislature, individual agencies use performance information most frequently.

Survey data, shown in Table 2, provide a picture that is consistent with the information as discussed. Survey respondents were asked to identify the most
important participants in the development of a performance measurement system. Respondents identified the following participants: agency heads (mean =
8.28, on the scale of 10) and program staff (mean = 7.89) as the most important
participants, followed by the governor (mean = 7.41) and the OPB (mean = 6.63).
The OPB respondents also ranked agency participation at the top, whereas House and Senate budget offices rated agencies' engagement slightly below appropriation committees, the governor, and elected officials.
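The comparative means reported in Table 2 (below) are simple averages of each respondent group's 1-to-10 importance ratings. A minimal Python sketch of that computation follows; the three sample raters and their scores are invented for illustration, not the survey's raw responses.

# Compute mean importance ratings per participant type for one respondent
# group, as in Table 2. Ratings are on a 1 (least important) to 10 (most
# important) scale; the raters below are hypothetical.
ratings = {
    "Agency head":              [9, 8, 8],
    "Agency and program staff": [8, 8, 7],
    "Governor":                 [7, 8, 7],
}

for participant, scores in ratings.items():
    mean = sum(scores) / len(scores)
    print(f"{participant}: mean importance = {mean:.2f}")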
In sum, Georgia’s measurement design process exhibits the following features.
First, although the central budget office initiates the process with top–down instructions, individual state agencies actually take on the detailed work. Outside agencies
are not active participants in the measurement process. In some cases, agencies
Lu / Managing the Design of Performance Measures 15
Table 2. Participants in Selecting Measures (The Comparative View)
Participants
Agency head
Agency and program staff
Governor
OPB
House and Senate budget offices
Appropriation committees
Citizen
Elected officials
Professional
State audit office
Response rate (%)
Number of surveys received
Agency
mean
8.28
7.89
7.41
6.63
5.12
5.09
4.95
4.06
3.63
3.20
65
110
Office of
Planning and
Budget (OPB)
mean
9.00
9.25
6.25
8.50
5.67
3.67
3.33
1.00
4.00
4.67
17a
4
House and
Senate budget
offices mean
6.70
6.00
7.00
6.40
5.70
7.00
4.20
6.90
3.40
4.10
83
10
a
The executive budget office (i.e., the OPB in Georgia) experienced a turnover rate (about 25
percent) during the data collection period of this research. Many analysts have not sat through one
budget cycle. On a couple of occasions, phone calls and e-mails were received from OPB analysts
who felt unable to complete the survey due to limited working duration with OPB. However, those
who did participate in this research tended to be senior people in OPB. The low response rate is
likely due in part to the turnover rate at that office.
Note: Responses are based on a scale from 1 (least important) to 10 (most important).
in Georgia almost exclusively selected the measures they would like to use in the
process. Second, developing performance measures is mostly a bottom–up process.
In general, the opinions of program managers and staff are valued and respected
by agency heads. Additionally, the process of measurement design varies among
agencies, as they seem to internalize the central budget office’s instructions. In
short, the paradox of “top–down direction for bottom–up implementation” (Long
& Franklin, 2004, p. 309) is present. Third, contrary to the details in Georgia's
Planning and Budgeting for Results Model, agencies tend to view the review/
design of performance measures as a separate piece from budget preparation even
though the timing of design matches the budget preparation phase.
Methods in Selection of Measures
I now focus on the question: How are measures selected in the first place? Because
the process suggests that agencies are the primary designers of measures, agencies
are the best place to look for responses to this question. The top factors influencing
the selection of one set of measures versus another set include (a) previous measures; (b) strategic plan, mission, and agency head’s vision; (c) program/division
managers’ input; (d) data availability; (e) performance information requested by
stakeholders in the budget cycle; and (f) professional and national standards.
First, designing measures has historically been an incremental process. Interviewees often described their processes of measurement design as starting from an examination of previous measures. The typical question asked was, "Do we want to
revise anything we did last year?” Agency heads contend that government services
do not change significantly from one year to the next; therefore, it makes sense
to continue using similar measures. In addition, the historical approach provides a
level of trust and comfort that is needed to sustain the performance system. In
some cases, this consistency of measures is strongly encouraged or even required.
Some agency fiscal officers believed that frequent changes in measures tended to
contribute to high administrative costs as well as generate data without a historical base. To many agencies, using previous years’ measures is not only legitimate
practice but also a reliable way to jumpstart design, as performance measures are
part of the institutional memory. In short, interviewee responses indicate that utilization of previous measures contributes to the following issues in the framework:
administrative feasibility, political acceptance, and measurement reliability.
Strategic plans, mission statements, and agency heads’ goals combine to form
another factor. The role of agency heads proves key in the development of performance measurement. Agency heads lead the strategic planning meetings, articulate
missions and goals, and set directions for agencies. As a number of interviewees
noted, their measures originally came out of the strategic planning process, and
their measurement processes started with the business missions and goals.5 To
many agencies, especially those that have been doing strategic planning for several years, strategic planning is not only a process, it is also an action plan with
measures as its manifestation. This method of selection is believed to contribute
to usefulness and acceptance of performance measures in the agency.
The third factor in deciding which measures to use is the input by program/
division managers. As previously mentioned, the process of performance design
is mostly a bottom–up process; therefore, developing measures, as one interviewee characterized it, “is in the hands of divisions” and is “a group process.” The
opinions of program/division managers are respected in most cases by the agency
management because, as one interviewee explained, “they are the people who
know most about what their divisions are doing.” In addition, there is a clear sense
among interviewees that manager participation is critical because it increases the credibility of the proposed measures. An interviewee's comment
captures the essence: “Developing measures is a two-way street of communication
within the agency.” In short, inputs from program/division managers are most cited
by respondents for their contribution to administrative feasibility, usefulness for
and acceptance by agencies, and measurement validity.
The factor of data availability is important to administrative convenience and
feasibility. Data availability encompasses three main issues with regard to performance measures. First, are high-quality data available? Second, in what form are
the data available? Third, what is the cost to access the data? In many cases, the
data are not available at all, or are available only in paper version (i.e., not electronically). Many interviewees noted that the most difficult measure to capture is
customer (or citizen) satisfaction. Interviewees noted the difficulty in identifying
customers who would be able to provide a legitimate evaluation of services for
performance information. One fiscal officer’s comment captures similar opinions
offered by several others: “The performance system is only as good as the data
itself. There is no good in collecting data that is not reliable.” To agencies, performance measurement should be the by-product of their work, not the focus of
it. If a measure needs a substantial amount of administrative work in collecting
data, then, as one fiscal officer straightforwardly noted, “We will not do it.” Data
availability is important for all four issues in the framework.
There are also situations in which certain measures are selected because they
are frequently asked for by others in the budget cycle, including the central executive budget office and the legislature. For example, the executive budget analysts
in OPB frequently ask agencies for, in addition to formal measures, additional
performance measures related to budget issues. As some fiscal officers noted, performance budgeting to them boils down to “OPB needs such and such performance
data by such and such date.” In this case, agencies select measures that they know
the central budget office will ask for as long as the same budget analyst works
with the agency. In addition, measures are sometimes added because legislators
are interested in a certain policy agenda. That certain measures are asked for by
others (i.e., state legislators) in the budget process signals to agencies that these
measures and their performance information are important public functions. In
some cases, as one interviewee concisely pointed out, measures are “created as a
reactive response to all.” However, the challenge is how to walk the fine line. An
interviewee captures the challenge facing agencies, namely, that measures should
focus on who the agency serves:
Citizens, [agency’s clientele], or governor, they all have the same goal, but they all
see things a little bit different. We have to make sure that the performance measures
are reaching all the customers. . . . We should understand from what perspectives
your measures ought to be described.
These comments indicate that this factor is most likely to contribute to political acceptance.
Although few agencies utilize professional and national standards as measures,
interviewees from agencies that do use them stated that these standards provide greater science and consistency in calculating the measures and, therefore, a greater opportunity to benchmark. Comments made by respondents suggest that involvement by external professionals could be instrumental for validity
and reliability, usefulness, and administrative feasibility in the framework.
In short, the comments made by agencies clearly unveil the stories and reasons
behind selected measures. The findings indicate the relative frequency of each primary selection method. It is important to realize that agencies tend to use
a combination of several methods. Future research should focus on the interactive
relation among these methods.
Impact of Selection Methods on Measurement Quality
The previous discussion examines the various primary methods used in selecting
performance measures. Hence, the next logical question is whether the methods
by which measures are selected affect measurement quality differently. To address this question, I compare the selection pattern of those who perceived their measurement quality to be high with the pattern of those who perceived it to be low. Sixty-eight percent of the agency respondents found the quality of their measures to be high, and 32 percent found the quality of their measures to need improvement. Table 3 reports the percentage of respondents who claimed
to use a particular method in their selection of measures. All respondents of both
groups indicated that their performance measurement design starts with an existing
design (previous measures). Among the group who claim to have high measurement quality, 84 percent and 91 percent note the involvement of agency staff and
agency heads, respectively (vs. 81 percent and 88 percent in the needs improvement
group). In addition, 34 percent of the high measurement quality group reported
the use of external professionals (vs. 14 percent in the needs improvement group).
These findings indicate that inputs by program managers, agency heads/strategic
planning, and external professionals contribute to measurement quality (i.e., an
active approach). The positive impact of this participatory approach in performance
measurement (deLancer Julnes, 2001) is further confirmed in this research.
On the other hand, the findings of this research should not be interpreted to suggest that it is never useful to select a measure because it was used before, the data are available, or it is requested by other stakeholders in the process.
In fact, the findings in the previous section explain the rationales for using these
selection methods. However, the research does indicate that a passive approach that
is driven by previous measures, existing data availability, and incidental requests
of performance measures tends to lead agencies to perceive the measurement quality to be low. Various selection methods do not equally influence measurement
quality. A more promising approach to improving measurement quality could be to involve and train agencies to increase their focus on quality and to borrow expertise from external professionals.
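Mechanically, the group comparison in Table 3 (which follows) amounts to computing, for each selection method, the share of each quality group that reported using it. The Python sketch below illustrates that computation; the five records and their method flags are invented for demonstration and are not the study's data, whose real agency sample comprised 110 respondents.

# Sketch of the Table 3 computation: percentage of each quality group
# reporting use of each selection method. All records are hypothetical.
METHODS = ["previous_measures", "agency_heads", "program_managers",
           "data_availability", "external_requests", "external_professionals"]

records = [
    ("high", {"previous_measures", "agency_heads", "program_managers", "external_professionals"}),
    ("high", {"previous_measures", "agency_heads", "data_availability"}),
    ("high", {"previous_measures", "program_managers"}),
    ("low",  {"previous_measures", "agency_heads", "data_availability", "external_requests"}),
    ("low",  {"previous_measures", "program_managers", "external_requests"}),
]

for group in ("high", "low"):
    subset = [methods for g, methods in records if g == group]
    print(f"{group} measurement quality (n={len(subset)}):")
    for m in METHODS:
        pct = 100 * sum(m in methods for methods in subset) / len(subset)
        print(f"  {m}: {pct:.0f}%")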
Table 3. Impact of Selection Methods on Measurement Quality

Selection methods                     High measurement   Low measurement
                                      quality (%)        quality (%)
Previous measures                     100                100
Agency heads/strategic planning       91                 88
Program managers/staff                84                 81
Data availability                     38                 42
Requests by external stakeholders     32                 42
External professionals                34                 14

Advantages and Disadvantages of the Agency-Centered Process of Measurement Design

Both the interviews and surveys suggest that the process for developing measures is agency-centered and bottom–up in construction and that agencies are versatile in their selection methods.
Why are agencies at the center of this process? Information garnered from interviews with agencies, the central budget office, the
House and Senate budget offices, as well as various budget documents, all seem
to suggest that this arrangement exists for practical reasons. Neither the central
budget office nor the legislative budget offices have the staff or the expertise to
pinpoint measures.
The OPB has limited staff to oversee its myriad functions. In Georgia, approximately 8 policy planners provide guidance on designing measures,
and 23 policy analysts make budget recommendations using measures. On the
other hand, approximately 200 agencies operate in Georgia, including attached
agencies and authorities, and the Governor’s Annual Budget Report identifies
about 280 programs. An agency budget officer noted, “OPB is overwhelmed by
the number of performance measures that they receive, not to mention suggesting
which measures are better or analyzing performance information in any detail.” In
addition, the OPB views itself functioning more as an adviser than as a decision
maker when it comes to developing measures or setting performance targets. One
OPB interviewee said, “I do not see OPB coming in, on behalf of agencies, setting
or adjusting performance measures or targets.”
House and Senate budget offices face difficulties similar to the OPB's in designing
performance measures and maintain even smaller staffs than the OPB. In Georgia,
House and Senate budget analysts number approximately eight and four, respectively. Furthermore, the General Assembly is even less equipped to help design
measures because it meets for only 40 legislative days each year.
The advantages of this agency-centered process of measurement design are
most apparent when viewed in this context. The workload of measurement design
gets distributed among all agencies. Agencies presumably know best about their
programs and, therefore, understand which measures to utilize. In addition, an
agency-centered process encourages greater buy-in from agencies. For example,
an interviewee noted, “If we did not assign agencies to design measures, then we
probably would end up in a situation where agencies are held accountable for
measures they have little or no faith in.”
Some interviewees are hesitant and cautious about this particular approach to
measurement. The main concerns raised by interviewees include: (a) agencies
would not select measures that may embarrass them, (b) agencies would not set performance targets that stretch their maximum capability, (c) the legislature may lose track of measurement quality, and (d) nonagency participants may reduce
their trust in measures.
Unfortunately, no quick or easy solutions to these concerns exist. Asked about
how to deal with these issues, agency heads and budget analysts responded in one or more of the following four ways:
• Trust agencies: Some interviewees argued that most public employees possess
good will. One agency interviewee noted an often-cited sentiment: "[W]hen you have a team together designing measures, you have a good chance of getting it
right.” In addition, given the increasing opportunity for benchmarking, the chance
of manipulation might diminish.
• Ask for additional measures: Several interviewees indicated that they routinely
asked for their own measures relevant to budget matters at hand, asking agencies
to compile and independently evaluate these measures.
• Apply a different model: Some interviewees argued that budgeting should be based on comprehensive evaluations (i.e., the evaluation model) rather than on two or three performance measures (i.e., the performance measure model), let alone on agency-led measurement design. For these interviewees, isolated performance measures did not adequately justify resource allocations. Many proposed a segmental evaluation model in which programs take turns being fully evaluated at an interval of every three to five years.
• Engage the central budget office more actively in the process of measurement
design: This suggestion, which proved the most-mentioned solution among
interviewees, confirms the role of central guidance in agency efforts and points
to the importance of balance between top–down and bottom–up approaches
(Moynihan & Ingraham, 2003). Although agencies are the center of producing
measures, the need for central guidance is imperative in an effective process.
In short, this largely bottom–up process of measurement design stems from practical considerations in the daily operations of public agencies. Although this research identifies several potential solutions to improve performance measurement, the search for additional alternatives/solutions must continue.
Conclusion
Agencies are at the center of producing performance measures in budgeting and
management (Joyce, 2003). Yet, how an agency selects a measure is rarely transparent to outsiders. This largely qualitative research provides a detailed account of
what it means for agencies to play a central role in selecting performance measures
as well as their intricate relationships with other stakeholders in the process. This
study examines the process of performance design, the science and art of developing measures, and the dynamics of measurement quality. The study is a response
to Joyce’s call that “a concern for measuring government performance should
simply be a concern for measuring it correctly” (1993, p. 3). Clearly, measuring
performance correctly does not come naturally; it must be managed.
The findings confirm that the current process of measurement design is largely
an agency-centered and bottom–up process. Performance measures are selected
through a variety of methods. The leading stated reasons for using a measure are
its prior use and its selection by program managers. However, the framework for the dynamics of measurement quality helps us understand that many reasons lie
behind the ways in which measures are selected. More important, this study finds
that an active approach to performance measurement involving agency staff, agency
heads, and external professionals improves the perception of measurement quality.
Although the agency-centered and bottom–up process is not without concerns,
my findings indicate that corrective solutions might well lie in the role of the central budget office in guiding the process.
Wholey stated, “It is preferable to consider ways to ensure reasonable quality
of the performance measurement process from the beginning” (2006, p. 267).
Understanding agency practices is the first step. Future research may focus on
the characteristics of agencies leading to a particular set of selection methods. By
doing so, the research has the potential to develop individualized strategies for
agencies with various contexts and capacities to improve measurement quality.
Notes
1. Clynch and Lauth (1991) and Lauth (1986) found that Georgia is 1 of 47 states in which
the governor has direct responsibility for budget preparation and execution, and Abney and
Lauth (1998) found that Georgia is 1 of 22 states in which the governor has relatively more
influence than the legislature in state appropriation. In addition, Georgia is also 1 of 33 states
where legislated performance-based budgeting requirements are in place (Melkers & Willoughby,
2004). Nevertheless, this evidence of similarity is not proof of generalizability but rather an indicator of its degree.
2. There are a total of 35 agencies (excluding the State of Georgia General Obligation Debt Sinking
Fund) listed in the Executive Branch section of the Governor’s Budget Report, Amended FY 2005
& FY 2006. Among the 31 agencies that participated, 2 shared the same fiscal officer.
3. In Georgia, the OPB instructs agencies to differentiate results measures from performance measures. The former is used to assess incremental progress toward program goals, and
the latter provides data on program demand, operations, and outputs (Georgia State Office of
Planning and Budget, 2005).
4. The extent to which these three sets of measures are different varies. Some agencies use
the same measures for formal and informal measures; others use different sets of measures.
Agencies that believe that the level of detail and information revealed by a few formal measures
does not meet their information needs for internal management tend to design different sets of
measures.
5. Although agencies conduct small-scale strategic planning almost every year during the budget preparation phase, they tend to organize comprehensive strategic planning only once every several years. Comprehensive strategic planning is when agencies intensively realign strategies and measures.
References
Abney, G., & Lauth, T.P. (1998). The end of executive dominance in state appropriations.
Public Administration Review, 58(5), 388–394.
Andrews, R., Boyne, G.A., & Walker, R.M. (2006). Subjective and objective measures
of organizational performance: An empirical exploration. In G.A. Boyne, K.J. Meier,
L.J. O’Toole, Jr., & R.M. Walker (Eds.), Public service performance: Perspectives on
measurement and management (pp. 14–34). New York: Cambridge University Press.
Behn, R.D. (2003). Why measure performance? Different purposes require different measures. Public Administration Review, 63(5), 586–606.
Brewer, G.A. (2006). All measures of performance are subjective: More evidence on U.S.
federal agencies. In G.A. Boyne, K.J. Meier, L.J. O’Toole, Jr., & R.M. Walker (Eds.),
Public service performance: Perspectives on measurement and management (pp. 35–54).
New York: Cambridge University Press.
Clynch, E.J., & Lauth, T.P. (1991). Governors, legislatures, and budgets: Diversity across
the American states. New York: Greenwood Press.
Courty, P., & Marschke, G. (2003). Performance funding in federal agencies: A case study
of a federal job training program. Public Budgeting & Finance, 23(3), 22–48.
deLancer Julnes, P. (2001). Does participation increase perceptions of usefulness? An
evaluation of a participatory approach to the development of performance measures.
Public Performance & Management Review, 24(4), 403–418.
Douglas, J.W. (1999). Redirection in Georgia: A new type of budget reform. American
Review of Public Administration, 29(3), 269–289.
Forsythe, D.W. (2001). Pitfalls in designing and implementing a performance management
system? In D.W. Forsythe (Ed.), Quicker, better, cheaper: Managing performance in
American government (pp. 519–551). Albany: Rockefeller Institute Press.
Georgia State Office of Planning and Budget. (2005). Prioritized program budget (PPB),
General preparation procedures: Fiscal year 2005. Atlanta.
Governor’s Office of Planning and Budget. (2004). Prioritized program planning and
budgeting: FY06 strategic and business planning guidelines for Georgia agencies.
Atlanta. Available at www.budnet.gatech.edu/stabudinfo/06OPBPrioritizedProgramPlanning-Instructions.pdf, accessed July 23, 2008.
Grizzle, G.A. (2001). Performance measures for budget justifications: Developing a selection strategy. In G.J. Miller, W.B. Hildreth, & J. Rabin (Eds.), Performance-based
budgeting (pp. 355–367). Boulder, CO: Westview Press.
Hatry, H.P. (1997). Where the rubber meets the road: Performance measurement for state
and local public agencies. In K.E. Newcomer (Ed.), Using performance measurement
to improve public and nonprofit programs: New directions for evaluation (Vol. 75, pp.
31–44). San Francisco: Jossey-Bass.
Hatry, H.P. (Ed.). (1999). Performance measurement: Getting results. Washington, DC:
Urban Institute Press.
Hatry, H.P. (2006). Performance measurement: Getting results. 2d ed. Washington, DC:
Urban Institute Press.
Hatry, H.P., Gerhart, C., & Marshall, M. (1994). Eleven ways to make performance measurement more useful to public managers. Public Management, 76(9), 15–18.
Huckaby, H.M., & Lauth, T.P. (1998). Budget redirection in Georgia state government.
Public Budgeting & Finance, 18(4), 36–44.
Joyce, P.G. (1993). Using performance measures for federal budgeting: Proposals and
prospects. Public Budgeting & Finance, 13(4), 3–17.
Joyce, P.G. (2003). Linking performance and budgeting: Opportunities in the federal budget
process. Washington, DC: IBM Center for the Business of Government.
Lauth, T.P. (1978). Zero-base budgeting in Georgia state government: Myth and reality.
Public Administration Review, 38(5), 420–430.
Lauth, T.P. (1985). Performance evaluation in the Georgia budgetary process. Public
Budgeting & Finance, 5(1), 67–82.
Lauth, T.P. (1986). The executive budget in Georgia. State and Local Government Review,
17, 56–64.
Lauth, T.P. (2004, June). Budget reform in the United States and the State of Georgia. Paper presented at the International Conference of Democratic Consolidation and
Administrative Reform, Taipei, Taiwan.
Long, E., & Franklin, A.L. (2004). The paradox of implementing the Government Performance and Results Act: Top–down direction for bottom–up implementation. Public
Administration Review, 64(3), 309–319.
Lu, Y. (2007). Performance budgeting: The perspective of state agencies. Public Budgeting
& Finance, 27(4), 1–17.
Lu, H., & Facer, R.L., II. (2004). Budget change in Georgia counties: Examining patterns
and practices. American Review of Public Administration, 34(1), 67–93.
Melkers, J.E., & Willoughby, K.G. (2004). Staying the course: The use of performance
measurement in state governments. Washington, DC: IBM Center for the Business of
Government.
Moynihan, D.P., & Ingraham, P.W. (2003). Look for the silver lining: When performance-based accountability systems work. Journal of Public Administration Research and
Theory, 13(4), 469–490.
Newcomer, K.E. (1997). Using performance measurement to improve programs. In K.E.
Newcomer (Ed.), Using performance measurement to improve public and nonprofit
programs: New directions for evaluation (Vol. 75, pp. 5–14). San Francisco: Jossey-Bass.
Pandey, S.K., & Moynihan, D.P. (2006). Bureaucratic red tape and organizational performance: Testing the moderating role of culture and political support. In G.A. Boyne,
K.J. Meier, L.J. O’Toole, Jr., & R.M. Walker (Eds.), Public service performance:
Perspectives on measurement and management (pp. 130–151). New York: Cambridge
University Press.
Radin, B.A. (2000). The Government Performance and Results Act and the tradition of
federal management reform: Square pegs in round holes? Journal of Public Administration Research and Theory, 10(1), 111–135.
U.S. Office of Management and Budget. (2007). Program Assessment Rating Tool (PART).
Available at www.whitehouse.gov/omb/expectmore/about.html, accessed July 13,
2007.
Wang, X. (2000). Performance measurement in budgeting: A study of county governments.
Public Budgeting & Finance, 20(3), 102–118.
Wholey, J. (2006). Quality control: Assessing the accuracy and usefulness of performance measurement systems. In H.P. Hatry (Ed.), Performance measurement: Getting results (2d ed., pp. 267–286). Washington, DC: Urban Institute Press.
Wholey, J., & Hatry, H.P. (1992). The case for performance monitoring. Public Administration Review, 52(6), 604–610.
Williams, D.W. (2003). Measuring government in the early twentieth century. Public
Administration Review, 63(6), 643–659.
Willoughby, K.G. (2004). Performance measurement and budget balancing: State government perspective. Public Budgeting & Finance, 24(2), 21–39.
Willoughby, K.G., & Melkers, J.E. (2000). Implementing PBB: Conflicting views of success. Public Budgeting & Finance, 20(1), 105–120.
Yi Lu, Ph.D., is an assistant professor in the Department of Public Administration,
College of Community and Public Affairs, State University of New York, Binghamton.
She can be reached at [email protected].