Forum

Developing Standards for Empirical Examinations of Evaluation Theory

Robin Lin Miller, Michigan State University, MI, USA

American Journal of Evaluation 31(3) 390-399
© The Author(s) 2010
Reprints and permission: sagepub.com/journalsPermissions.nav
DOI: 10.1177/1098214010371819
http://aje.sagepub.com

Corresponding Author: Robin Lin Miller, Department of Psychology, Michigan State University, East Lansing, MI 48824, USA
Email: [email protected]

Keywords: evaluation theory, evaluation practice, practice-theory relationship, research on evaluation

Evaluation scholars have long called for research on evaluation (Mark, 2001, 2003; Shadish, Cook, & Leviton, 1991; Smith, 1993; Worthen, 2001) to provide an empirical basis for improving its theory and practice. Although calls to investigate evaluation have struck a chord in some quarters of the evaluation community, they have been infrequently answered, with the exception of research in the area of evaluation use. Specific frameworks and guidance on how to study evaluation have seldom been offered, which may contribute to the relative dearth of research on evaluation. This article focuses on one area in which research on evaluation is sorely needed: the relationship between theory and practice. In it, I develop a preliminary framework for studying critical aspects of the value of theory to practice.

Why Study Theory's Use in Practice?

Evaluation theories are intended to provide evaluators with the bases for making the myriad decisions that are part of designing and conducting an evaluation. Evaluation theories provide practitioners with ideological perspectives on evaluation (Smith, 2007), with sensitizing concepts to guide practice, and, to varying degrees, with specific guidance on matters such as defining the appropriate role of the evaluator in relation to the evaluand and to individuals in the settings that house it (e.g., Ryan & Schwandt, 2002; Skolits, Morrow, & Burr, 2009); selecting evaluation questions and pairing these with methods (e.g., Greene, 2007; Mark, Henry, & Julnes, 2000; Rossi, Lipsey, & Freeman, 2004); determining whose informational needs are to be met via the evaluation (e.g., Abma & Stake, 2001; Greene, 1997; Mark & Shotland, 1985); selecting who may participate in shaping the direction of the evaluation and in what fashion (e.g., Cousins & Earl, 1992; Cousins & Whitmore, 1998; Fetterman, 1994); and identifying when, how, to whom, and for what purpose evaluation findings are to be disseminated (e.g., Patton, 2008; Preskill & Torres, 1999a, 1999b).

Taking theoretical prescriptions seriously should result in evaluations that are markedly different on multiple dimensions, including the consequences of doing evaluation in a particular fashion. By way of example, extending an idea developed by William Shadish and Nick Smith in course syllabi they each developed in the 1990s, I created a class assignment in which graduate students in my program evaluation course had to adopt the role of a leading evaluation theorist. Choices of theorist included influential scholars such as Michael Scriven, Robert Stake, Michael Quinn Patton, Yvonna Lincoln, and Carol Weiss, among others. Over the course of the semester, each student had to conduct an evaluation of precisely the same evaluand, following the prescriptions of their theorist closely.
To prepare to design and conduct their evaluations, students relied primarily on their chosen theorists' prescriptive writing about why and how evaluation ought to be done, rather than on reading reports of actual evaluations that may have been conducted by those theorists. That is, I pushed students to envision evaluations based on what theorists say one ought to do, rather than on what theorists may actually do. The evaluations that students produced each time I taught this course were strikingly different on matters such as the questions posed, methods applied, information generated, perceived utility of the evaluation to program stakeholders, nature of judgments about the program, and so on. The students' observations on how easy or difficult it was to follow the prescriptions and how clear the theorists were regarding the details of practice also varied widely. This simple exercise highlighted a point made by Smith and others: Sorting through theories and determining their ultimate feasibility and merit would benefit from close empirical examination of how evaluation theories can be and are applied in practice, whether they consistently and reliably lead to successful evaluation, and under what circumstances "good" evaluations are likely to emerge. As Smith notes,

Theorists present and advocate theories largely in abstract conceptual terms, seldom in concrete terms based on how the theories would be applied in practice. We need to know how practitioners articulate or operationalize various models or theories, or whether, in fact, they actually do so. Indeed, it is not clear what is meant when an evaluator claims to be using a particular theoretical approach. (Smith, 1993, p. 240)

Just as understanding practice can provide a basis for developing theory, Smith argues that studies of practice and of the use of theory in practice can also form the building blocks for developing stronger evaluation theory (see also Shadish et al., 1991). Although the benefits of evaluating how theories perform in practice seem obvious, there have been few attempts to examine theories in this way. We also lack well-developed frameworks for considering how theories might be examined empirically. Prior attempts to evaluate whether and how evaluation theories are put into practice suggest an emergent framework for empirically exploring how theory informs practice and whether particular theories of practice yield better evaluations. In the remainder of this article, I propose a set of criteria to guide research on the theory-practice relationship and illustrate the value of each criterion by drawing on published studies that have investigated theory-practice congruence and divergence.

Criterion 1: Operational Specificity

For a theory to be used in practice, it must translate into clear guidance and sensitizing ideas for practitioners, and its theoretical signature must be recognizable. Specific guidance for practice may include the normative bases for and procedural guidelines regarding when, how, and what evaluation questions are identified and prioritized; who participates in each stage of the evaluation process; what role the evaluator assumes; what methods are ideal; how values underlying the theory are best enacted; and how plans for using the evaluation process and its results are considered.
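To make this criterion concrete, the prescriptive dimensions just listed could be recorded in a common structured form for each theory under study. The sketch below is purely illustrative; the field names and the example entry are hypothetical and are not drawn from any published instrument or any particular theorist's writing.

from dataclasses import dataclass, field
from typing import List

# Hypothetical record of a theory's operational content along the prescriptive
# dimensions named above. Field names and example values are illustrative only.
@dataclass
class TheoryOperationalization:
    theory: str
    question_selection: str        # when, how, and what questions are identified and prioritized
    participants_by_stage: str     # who participates in each stage of the evaluation
    evaluator_role: str            # the role the evaluator is expected to assume
    preferred_methods: List[str]   # methods the theory treats as ideal
    values_enactment: str          # how the theory's underlying values are best enacted
    use_plan: str                  # how use of the process and its results is planned for
    ambiguities: List[str] = field(default_factory=list)  # prescriptions left underspecified

example = TheoryOperationalization(
    theory="Hypothetical participatory approach",
    question_selection="Negotiated with primary intended users at entry",
    participants_by_stage="Program staff in design, data collection, and interpretation",
    evaluator_role="Facilitator and coach",
    preferred_methods=["mixed methods"],
    values_enactment="Shared decision making throughout the study",
    use_plan="Interim findings reviewed with stakeholders each quarter",
    ambiguities=["how to resolve disagreements among stakeholders"],
)
print(example.theory, "-", example.evaluator_role)

Coding several theories, and several cases that claim to apply them, into a shared structure of this kind would make operational ambiguities visible and comparable across theories.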
Empirical evaluation of theory, therefore, requires precise articulation of the implications for practice inherent in the theory, as well as identification of its operational ambiguities. A recognizable signature is necessary to support the claim that any particular theory adds value to practice. Shadish and Epstein (1987) surveyed a sample of members of the Evaluation Research Society and Evaluation Network to examine patterns in evaluators' approaches to their practice and perceived theoretical influences on those patterns. The questionnaire asked evaluators to report on characteristics of their training and work setting, details regarding the last evaluation that they conducted, and the influence on their practice of 21 publications that the researchers regarded as seminal works by prominent theorists such as Carol Weiss, Robert Stake, Michael Scriven, and Donald Campbell. Perhaps the most striking finding of the investigation was the overall low level of familiarity with most of the 21 publications, calling into question the degree to which they were direct influences on practice. According to the study authors, a majority of respondents were unfamiliar with 71% of the seminal writings that appeared on the survey. Thus, among a sample of individuals who identified with evaluation enough to associate with professional evaluation societies, the theoretical work under investigation was not readily recognized by practitioners as influencing their practice. Christie's more recent study of Healthy Start evaluators in California (Christie, 2003) also found that practicing evaluators did not report explicit connections between select evaluation theories and their practice. Although these studies suggest that the link between evaluation theory and evaluation practice is tenuous, neither study examined in detail the various implications of specific evaluation theories for practice. Shadish and Epstein's (1987) study does indicate a connection between classes of theory, contexts of practice, and evaluation purposes. Shadish and Epstein identified four discernible patterns of practice (academically oriented, service-oriented, decision-oriented, and outcome-oriented), each of which could be predicted by training, work setting, and theoretical influences. These authors found, for example, that evaluators who were academically and outcome oriented (e.g., those who favored selecting questions based on their probable contribution to a substantive body of literature and who focused on causal questions oriented toward academic publication of their evaluation findings) were most likely to perceive the work of Donald Campbell, Lee Cronbach, and Peter Rossi as influential. By contrast, service- and decision-oriented evaluators, those who selected questions to meet the informational needs of a client or for accountability purposes, reported Michael Scriven and Robert Stake as influential. Although the investigation stopped short of exploring in depth and in fine-grained fashion how evaluators used these ideas in practice and to what end, it provides initial insight into how the operationalization of evaluation theory provides orienting guideposts for practice among the evaluators who reported familiarity with the identified works.
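Survey studies of this kind lend themselves to straightforward descriptive analysis. The sketch below assumes hypothetical respondent records rather than Shadish and Epstein's actual instrument or data; it simply shows one way to tabulate which theorists respondents report as influential within each pattern of practice.

import pandas as pd

# Hypothetical respondent records; the fields and values are illustrative and
# do not reproduce the Shadish and Epstein (1987) survey or its findings.
responses = pd.DataFrame({
    "practice_pattern": ["academic", "outcome", "service", "decision", "academic"],
    "influential_theorist": ["Campbell", "Rossi", "Stake", "Scriven", "Cronbach"],
})

# Cross-tabulate reported theoretical influence by pattern of practice.
influence_by_pattern = pd.crosstab(
    responses["practice_pattern"], responses["influential_theorist"]
)
print(influence_by_pattern)

With real survey data, a table of this kind is a starting point for asking whether reported influences cluster by training, work setting, and evaluation purpose, as Shadish and Epstein found.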
Alkin and Christie (2004) use a tree metaphor to assign select theorists to one of three branches: valuing, methods, and use. The theorists whom Shadish and Epstein (1987) found to be associated with the academically oriented, outcome perspective are classified by Alkin and Christie on the branch of their theory tree associated with knowledge construction, the methods branch. By contrast, the theorists who were perceived as influential by service- and decision-oriented evaluators are classified on the valuing branch of the theory tree, a branch characterized by its theoretical emphasis on articulating evaluation's and evaluators' roles in determining the value of social programming. Alkin and Christie's classifications reflect the importance of methods and values as separate orienting ideas for evaluators occupying distinct professional niches. Empirical research on theory-practice connections thus indicates that evaluators may affiliate with broad classes of evaluation theories. The research conducted to date on the ways in which theories may inform practice, however, has not considered the specific prescriptive elements of theories in detail or examined whether and how these prescriptions inform evaluators' practice decisions. Further empirical examination of theories in practice would be facilitated by careful assessment of how theoretical prescriptions may translate into practice guidance across the many activities, decisions, and roles that designing and conducting evaluations entail.

Criterion 2: Range of Application

The Shadish and Epstein (1987) study highlights another criterion on which to consider and research theoretical prescriptions for practice. In their research, Shadish and Epstein found that particular theories appeared to have the highest relevance to particular practice circumstances. That is, academicians were drawn to theories that placed a high priority on matters of method and on evaluation's purpose of contributing to a larger knowledge base, whereas evaluators outside the academy were drawn to theories that attended more closely to issues such as determining merit and worth and decision-oriented use. No single theory may be ideally suited to every situation an evaluator may encounter. The empirical evaluation of theory must, therefore, consider the described limits of the theory's application. What are the most suitable conditions for applying the theory? When the theory is applied under ideal circumstances, are the processes and outcomes similar to or different from those that occur when it is applied in circumstances that are less than ideal? How adaptable is the theory across a range of conditions? Descriptively, it is also important to identify under what practice circumstances and in pursuit of what evaluative questions any theory has been and can be applied. Williams (1989) used similarity ratings to develop a taxonomy of evaluation theory and practice. Fourteen theorists provided ratings of their theoretical similarity to each of the other theorists included in the study. The theorists also rated their practice along seven dimensions. Data were analyzed through multidimensional scaling techniques. In their theoretical positions, theorists differed along four principal dimensions: quantitative versus qualitative methodological preference, accountability versus policy orientation, client involvement versus noninvolvement, and conceptual use for unspecified users versus decision-oriented use for specific users. The map of theorists' practice revealed only two dimensions: interpretive-descriptive versus causal claims and specific versus general use.
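The scaling step that Williams describes can be illustrated with a small sketch. The similarity matrix below is hypothetical (Williams collected ratings from 14 theorists, not 4), and the use of scikit-learn's MDS estimator is simply one convenient modern way to embed precomputed dissimilarities; it is not a reconstruction of her analysis.

import numpy as np
from sklearn.manifold import MDS

# Hypothetical similarity ratings among four theorists (1 = very similar, 0 = dissimilar).
labels = ["Theorist A", "Theorist B", "Theorist C", "Theorist D"]
similarity = np.array([
    [1.0, 0.8, 0.3, 0.2],
    [0.8, 1.0, 0.4, 0.3],
    [0.3, 0.4, 1.0, 0.7],
    [0.2, 0.3, 0.7, 1.0],
])

# Multidimensional scaling works on dissimilarities, so invert the ratings
# and request a two-dimensional configuration.
dissimilarity = 1.0 - similarity
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coordinates = mds.fit_transform(dissimilarity)

for name, (x, y) in zip(labels, coordinates):
    print(f"{name}: ({x:.2f}, {y:.2f})")

Interpreting the axes of such a configuration, as Williams did in labeling dimensions such as methodological preference and intended use, remains a substantive judgment rather than an output of the algorithm.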
Williams concludes from this that theorists' prescriptions are more complex than their practice. Importantly, however, she finds a set of theorists whose practice does not sit at the extremes of the practice dimensions and who may be considered flexible practitioners who adapt their evaluation practice to circumstances. Although she does not consider how adaptable their theories are, her findings point to the importance of considering how relevant any one theoretical approach may be when one encounters the realities of the field and the evaluator's professional and disciplinary context. In one of the few studies that explicitly examined how a particular theory was used in practice, a study of empowerment evaluation, Miller and Campbell (2006) found that in a small number of cases evaluators reported that a majority of stakeholders were uninterested in, and had no time for, the intensive engagement required of them to be part of an empowerment evaluation. The approach, which relies on stakeholders to engage in an iterative and collective process of taking stock of their goals, objectives, processes, and outcomes, could not be readily adapted to a situation in which setting members were disinclined toward its use and preferred that the evaluator conduct a different kind of evaluation. In these cases, the evaluators reported that they had to adopt kindred participatory approaches that required less of setting members in order to move the evaluations forward. Thus, empowerment evaluation's relevance to settings in which staff cannot or will not dedicate their effort to an evaluation may be low, and other models may be better suited to such settings. Miller and Campbell also found that what they called a Socratic approach to empowerment evaluation occurred more often in single-site evaluations than in multisite evaluations and that the Socratic approach best reflected the principles articulated by empowerment evaluation theorists, again suggesting another boundary on the approach's range of application. Both studies underscore the need for, and benefits of, examining the contingencies governing the range of application of particular evaluation theories and describing the limits of their application through empirical means.

Criterion 3: Feasibility in Practice

Many theories represent a set of ideals that may not be easily applied in practice. Evaluating theories of practice should include some assessment of how easy or difficult the prescriptions for practice and sensitizing ideas in the theory are to apply. Can an evaluator readily do what the theory requires of him or her? Smith (1985) examined all of the published case examples of the use of adversary and committee hearing procedures in evaluation. His review of case examples identified a range of challenges to using the approach, including its high preparation costs, intensive management demands, and expertise requirements. Despite the approach's appeal to democratic deliberative ideals, its infeasibility appears to have led to its rare use. Recent debates about the appropriate use of experimental designs in evaluation frequently note that these designs are difficult to do well.
The technical, ethical, skill, and resource requirements associated with these designs have implications for the evaluation circumstances under which it is feasible to follow theoretical prescriptions that emphasize cause-and-effect questions and the use of experiments to address those questions (see, for instance, the edited collection of essays on credible evidence in evaluation by Donaldson, Christie, & Mark, 2009, and the edited collection of essays on fundamental and enduring issues in evaluation by Smith & Brandon, 2008). Many theorists agree that evaluation is not simply a technical activity; it is a political and social activity too. Many of my students, having completed the exercise I described earlier, have been surprised to find that being a utilization-focused evaluator would be a great deal easier to do successfully if only they had the political savvy, seasoned expertise, and interpersonal gifts of Michael Patton! Emerging taxonomies of core evaluator competencies (e.g., Stevahn, King, Ghere, & Minnema, 2005) also reflect the need for evaluators to be more than technicians. Skolits and colleagues (2009) identify roles that an evaluator enacts over the course of conducting an evaluation, including manager, negotiator, detective, diplomat, judge, reporter, learner, and researcher, among others. Theories vary in the degree to which they require and emphasize the skills associated with each of these roles and in the extent to which they call on the evaluator to enact few or many of them. Theories may also have implications for how roles are ideally enacted and role switches performed. For example, emerging theories of culturally competent evaluation place strong emphasis on the evaluator as a reflexive learner (e.g., Symonette, 2004). These theories expect evaluators to be adept at acquiring informal knowledge about the cultural rules and perspectives in evaluation settings and at interrogating the self in relation to those with whom the evaluator interacts. Eisner's (1991) connoisseurship approach requires that the evaluator have expert authority on, and a specialist's eye for, the programmatic substance. Other theories (e.g., Rossi et al., 2004) place great emphasis on formal knowledge of evaluation and on the researcher role. Some theories may have limits to their feasibility because of the nature and combination of role demands they place on evaluators, or they may be feasible only for evaluators who possess particular combinations of skills and traits. Although technical and role demands are only two examples of features that may influence an evaluation model's feasibility, determining whether particular theoretical prescriptions are infeasible altogether or only under particular circumstances is necessary to permit informed selection of approaches to practice.

Criterion 4: Discernible Impact

Many theories are intended to achieve impacts that result from how the evaluation is conducted. For instance, theories have been offered that emphasize the value of evaluation for promoting democratic dialogue among stakeholders (House & Howe, 1999), facilitating organizational learning (Baizerman, Compton, & Stockdill, 2002; Preskill & Torres, 1999a, 1999b; Sanders, 2003), transforming social arrangements (Mertens, 2009; Vanderplaat, 1995), and improving evaluation influence (Patton, 2008; Wholey, 1983).
A critical area of empirical assessment of theory concerns close examination of whether the use of a particular theory actually leads to the impacts that are expected and desired and whether unintended effects occur (see also Henry & Mark, 2003). A principal focus of Miller and Campbell's (2006) study of empowerment evaluation concerned documenting the evidence that empowerment evaluation processes led to empowered outcomes. That is, to what extent did using empowerment evaluation empower individuals and organizations or redress social injustice? Theoretically, if empowerment evaluation or any other approach is implemented as intended by its developers, there should be discernible benefits attributable to the approach itself. Indeed, many theories are justified on the basis of claims about the desirable outcomes produced by applying the particular approach. In their review, Miller and Campbell (2006) found only seven cases in which an author attempted to evaluate the evaluation itself through systematic data collection and provided results on the outcome of using the empowerment evaluation process. They found weak evidence for claims regarding the specific benefits of empowerment evaluation, though they did find some evidence that engaging with the stakeholders and setting in a Socratic manner was associated with more reported benefits than using the approach in other ways. Amo and Cousins (2007) recently studied cases of process use in practice. By examining cases, they identified three broad categories of indicators of process use: learning, behavior, and attitude. They note, however, "The literature search conducted in the context of this study, although not exhaustive, shows a relative paucity of empirical studies examining the concept of process use directly or indirectly. Almost a decade after the concept was coined, there remains much opportunity to study, question, and substantiate process use" (Amo & Cousins, 2007, p. 21). They go on to note the weakness of the empirical evidence regarding process use and the procedures for encouraging it.

Criterion 5: Reproducibility

An important component of determining the impact of evaluation theories is whether any impacts that are observed can be reproduced over time, occasions, and evaluators. It therefore becomes essential to know what diverse evaluators actually do when they claim to employ an approach, whether their implementation of that approach approximates the standards set for it, and whether the approach can achieve its intended outcomes in diverse evaluators' hands. In their study of empowerment evaluation, Miller and Campbell (2006) found that, despite its comparatively clear operational guidelines (e.g., broad community and stakeholder involvement in all aspects of the evaluation, a taking-stock process, the evaluator acting as coach), case reports indicated wide variation in how the approach was implemented. For instance, Miller and Campbell identified one case in which there was no stakeholder input into any aspect of the evaluation. In roughly a third of the cases, only staff members were involved in the evaluation, and in limited ways, such as reviewing measures selected by the evaluator. Although it is reasonable to expect that those who use an approach may not follow it prescriptively, the wide variation Miller and Campbell found indicated that in a majority of cases evaluators had not reproduced a process that was recognizable as empowerment evaluation. Is this because the approach is difficult to reproduce?
Miller and Campbell's (2006) data are not adequate to address this question fully, but they provide some evidence that the approach is reproducible. Miller and Campbell identified cases conducted by evaluators other than Drs. Fetterman and Wandersman, the originators of empowerment evaluation, in which relatively close adherence to the principles and procedures Fetterman and Wandersman prescribe was evident. Some evaluators did reproduce the approach successfully. Because theories are to be used by evaluators other than their inventors, examination of whether evaluators can reproduce an approach and its outcomes is essential. Close examination of the reproducibility of theories may also help to categorize them according to the degree to which they are primarily useful as sensitizing ideologies or as sources of practical guidance on carrying out aspects of evaluation. Better evaluations, as defined in the terms of the various theories, are a major reason theories were developed in the first place. In crafting their theories, theorists set out to improve evaluation and leverage its impact. Research on evaluation should therefore investigate the impacts of following various approaches to practice, the extent to which these lead to the qualities of an evaluation for which theorists hope, and the degree to which these impacts can be produced consistently.

Consequences of Criterion-Based Investigation of Evaluation Theory

The evaluation profession must gain a better understanding of the requirements for concretizing, measuring, and testing the effects of evaluation frameworks and procedures empirically. The criteria I have proposed here represent an initial step toward articulating what we might do to move research in this area forward and to generate a solid descriptive account of the relationship between real-world practice and evaluation theory. The criteria I propose complement the recent general framework for research on evaluation practice proposed by Mark (2008). Mark identifies four general categories of inquiry: context, process, consequences, and professional issues. He proposes general subcategories of investigation in each area, such as researching the societal and organizational context in which evaluations occur, and suggests potential modes of inquiry for each. The criteria I propose here provide a specific way to engage Mark's framework so that it includes consideration of the practical merits of an evaluation theory across Mark's inquiry domains. These criteria provide a point of departure for addressing a more specific set of research questions on evaluation theory and for building the empirical base to inform the refinement of contingent theories of practice (see Shadish, 1998; Shadish et al., 1991).¹ Employing the framework I have proposed also facilitates posing questions about the theory-practice relationship that are not yet well explored. Questions that emerge from this framework include those that consider how each of these features of a theory (operational clarity, range of application, implementation feasibility, evaluation impact, and reproducibility) may relate to the others. For instance, how do impacts vary across a theory's range of application? The framework may also assist evaluators in identifying questions about how aspects of a theory may relate to the domains of factors identified by Mark.
Do experts achieve higher reproducibility than novices, regardless of the operational clarity of an approach? How does combining approaches affect the expected impacts of each approach? Are theories with particular features, such as high operational clarity or feasibility, more or less influential, and more or less used, in particular practice contexts? These criteria also provide a unique framework against which theories may be classified, one that may be of particular benefit to practitioners. By understanding that theories have optimal ranges of application, do or do not combine well with other approaches, or present challenging role demands, practitioners may be better able to sort through theories and discern how to bring the benefits of a range of theories to daily practice, and theorists may be pushed to consider the practice components of their theories in new and more refined ways. Applying criteria such as those I have described has important implications for what evaluators report in describing practice experiences and for the methodological criteria used to select and assess evaluations in which particular theories have been applied. Perhaps most obviously, cases must be described in adequate detail to allow others to study them. Important details would include clear statements of the evaluation setting, the evaluation purpose, the rationale for applying the theoretical approach, an articulation of how the theory was enacted in the particular case, descriptions of all actors and their roles, a chronological event history of the evaluation, and information on what outcomes were expected to accrue from applying the approach and on when and how these were substantiated. These criteria highlight the importance of acquiring multiple case examples in which evaluators believe they have translated a theory into practice, so that an adequate sample of cases is available for study. Additionally, they point to the need for prospective evaluation of evaluations to become a standard of practice. In the real world of practice, evaluators may not use any theory as more than a vague map to facilitate and guide their action. Evaluators may also use combinations of theories rather than apply any theory in pure fashion. And the demands of the particular evaluation situation may predominate over theoretical prescriptions, preferences, and training. Yet evaluators have invested heavily in theorizing how to maximize the success of evaluations. Systematic and rigorous research on these theories can provide essential information for developing an evidence base for a theoretically rooted evaluation practice, as well as the evidentiary base for the development of practice-based theory.

Note

1. These criteria are distinct in purpose from the criteria articulated for evaluating evaluations insofar as most meta-evaluation addresses whether evaluations meet practice standards such as the Joint Committee Standards. Meta-evaluations do not typically consider the quality of the theory that underlies the specific practice instance under consideration.

Acknowledgments

The author thanks Melvin M. Mark, Nicholas Smith, Karen E. Kirkhart, Rebecca Campbell, and Miles A. McNall for their helpful comments and suggestions on prior drafts of this essay.

Declaration of Conflicting Interests

The author(s) declared no conflicts of interest with respect to the authorship and/or publication of this article.

Funding

The author(s) received no financial support for the research and/or authorship of this article.
References

Abma, T. A., & Stake, R. E. (2001). Stake's responsive evaluation: Core ideas and evolution. New Directions for Evaluation, 92, 7-22.
Alkin, M. C., & Christie, C. A. (2004). An evaluation theory tree. In M. C. Alkin (Ed.), Evaluation roots: Tracing theorists' views and influences. Thousand Oaks, CA: SAGE.
Amo, C., & Cousins, J. B. (2007). Going through the process: An examination of the operationalization of process use in empirical research on evaluation. New Directions for Evaluation, 116, 5-26.
Baizerman, M., Compton, D. W., & Stockdill, S. H. (Eds.). (2002). The art, craft, and science of evaluation capacity building. New Directions for Evaluation, 93.
Christie, C. A. (2003). What guides evaluation? A study of how evaluation practice maps onto evaluation theory. New Directions for Evaluation, 97, 7-36.
Cousins, J. B., & Earl, L. M. (1992). The case for participatory evaluation. Educational Evaluation and Policy Analysis, 14, 397-418.
Cousins, J. B., & Whitmore, E. (1998). Framing participatory evaluation. New Directions for Evaluation, 80, 5-23.
Donaldson, S. I., Christie, C. A., & Mark, M. M. (Eds.). (2009). What counts as credible evidence in applied research and evaluation practice? Thousand Oaks, CA: SAGE.
Eisner, E. (1991). The enlightened eye. New York, NY: Macmillan.
Fetterman, D. (1994). Empowerment evaluation. Evaluation Practice, 15, 1-15.
Greene, J. C. (1997). Evaluation as advocacy. Evaluation Practice, 18, 25-35.
Greene, J. C. (2007). Mixed methods in social inquiry: Research methods for the social sciences. San Francisco, CA: Jossey-Bass.
Henry, G. T., & Mark, M. M. (2003). Toward an agenda for research on evaluation. New Directions for Evaluation, 97, 69-80.
House, E. R., & Howe, K. R. (1999). Values in evaluation and social research. Thousand Oaks, CA: SAGE.
Mark, M. M. (2001). Evaluation's future: Furor, futile, or fertile? American Journal of Evaluation, 22, 457-479.
Mark, M. M. (2003). Toward an integrative view of the theory and practice of program and policy evaluation. In S. I. Donaldson & M. Scriven (Eds.), Evaluating social programs and problems: Visions for the new millennium (pp. 183-204). Mahwah, NJ: Lawrence Erlbaum Associates.
Mark, M. M. (2008). Building a better evidence base for evaluation theory: Beyond general calls to a framework of types of research on evaluation. In N. L. Smith & P. Brandon (Eds.), Fundamental issues in evaluation (pp. 111-134). New York, NY: The Guilford Press.
Mark, M. M., Henry, G. T., & Julnes, G. (2000). Evaluation: An integrated framework for understanding, guiding, and improving policies and programs. San Francisco, CA: Jossey-Bass.
Mark, M. M., & Shotland, R. L. (1985). Stakeholder-based evaluation and value judgments. Evaluation Review, 9, 605-626.
Mertens, D. M. (2009). Transformative research and evaluation. New York, NY: The Guilford Press.
Miller, R. L., & Campbell, R. (2006). Taking stock of empowerment evaluation: An empirical review. American Journal of Evaluation, 27, 296-319.
Patton, M. Q. (2008). Utilization-focused evaluation (4th ed.). Thousand Oaks, CA: SAGE.
Preskill, H., & Torres, R. T. (1999a). Building capacity for organizational learning through evaluative inquiry. Evaluation, 5, 42-60.
Preskill, H., & Torres, R. T. (1999b). Evaluative inquiry for learning in organizations. Thousand Oaks, CA: SAGE.
Rossi, P. H., Lipsey, M. W., & Freeman, H. E. (2004). Evaluation: A systematic approach (7th ed.). Thousand Oaks, CA: SAGE.
Ryan, K. E., & Schwandt, T. A. (Eds.). (2002). Exploring evaluator role and identity. Greenwich, CT: Information Age Publishing.
Sanders, J. R. (2003). Mainstreaming evaluation. New Directions for Evaluation, 99, 3-6.
Shadish, W. R. (1998). Evaluation theory is who we are. American Journal of Evaluation, 19, 1-19.
Shadish, W. R., Cook, T. D., & Leviton, L. C. (1991). Foundations of program evaluation: Theories of practice. Newbury Park, CA: SAGE.
Shadish, W. R., & Epstein, R. (1987). Patterns of program evaluation practice among members of the Evaluation Research Society and Evaluation Network. Evaluation Review, 11, 555-590.
Skolits, G., Morrow, J., & Burr, E. (2009). Evaluator responses to evaluation activities and demands: A re-conceptualization of evaluator roles. American Journal of Evaluation, 30, 275-295.
Smith, N. L. (1985). Adversary and committee hearings as evaluation models. Evaluation Review, 9, 735-750.
Smith, N. L. (1993). Improving evaluation theory through the empirical study of evaluation practice. Evaluation Practice, 14, 237-242.
Smith, N. L. (2007). Empowerment evaluation as ideology. American Journal of Evaluation, 28, 169-178.
Smith, N. L., & Brandon, P. (Eds.). (2008). Fundamental issues in evaluation. New York, NY: The Guilford Press.
Stevahn, L., King, J. A., Ghere, G., & Minnema, J. (2005). Establishing essential competencies for program evaluators. American Journal of Evaluation, 26, 43-59.
Symonette, H. (2004). Walking pathways toward becoming a culturally competent evaluator: Boundaries, borderlands, and border crossings. New Directions for Evaluation, 102, 95-109.
Vanderplaat, M. (1995). Beyond technique: Issues in evaluating for empowerment. Evaluation, 1, 81-96.
Wholey, J. (1983). Evaluation and effective public management. Boston, MA: Little, Brown.
Williams, J. E. (1989). A numerically developed taxonomy of evaluation theory and practice. Evaluation Review, 13, 18-31.
Worthen, B. R. (2001). Whither evaluation? That all depends. American Journal of Evaluation, 22, 409-418.