Job market paper: Daniel Hedblom

Toward an Understanding of Corporate Social Responsibility: Theory and Field Experimental Evidence*

Daniel Hedblom†, Brent R. Hickman, and John A. List
The University of Chicago

This version: December 7, 2016
Download latest version: http://jmp.danielhedblom.com

Corporate social responsibility (CSR) has been on the business landscape for more than half a century. Advocates of CSR argue that such activities can increase long-term profits and social well-being, while critics argue that CSR serves as a major distraction. Social scientists and business scholars are largely divided on the value of CSR, with most of the focus on the demand side. We develop a theory and a tightly linked field experiment to explore the supply-side impact of CSR. Our natural field experiment, in which we created our own firm and hired actual workers, generates a rich data set on worker behavior and responses to CSR incentives. We use these data to estimate a structural principal-agent model with three dimensions of worker heterogeneity: productivity, work quality, and value of time. This allows us to answer a variety of questions related to treatment effects (on existing employees' behaviors) and selection effects (on the pool of applicants) from CSR. We find strong evidence that when a firm convinces its workers that their efforts make the world a better place (as opposed to just making money) it will attract workers who are more productive, produce higher quality work, and have more highly valued leisure time. We also find an economically significant treatment effect of CSR on improving the work quality of existing employees. Our research design may serve as a framework for causal inference on more general forms of non-pecuniary incentives in the workplace.

JEL Classifications: C14, J22, J30, M52

* We thank Stephane Bonhomme, Larry Schmidt, Steven Levitt, and Emir Kamenica for helpful comments.
Special thanks to Joseph Seidel for excellent research assistance. The authors wish to thank seminar participants at The University of Chicago, University of Wisconsin, the Economic Science Association North American meeting, and the University of Chicago Advances with Field Experiments conference for helpful feedback.
† Corresponding author. E-mail: [email protected]. Address: The University of Chicago, Department of Economics, 1126 East 59th Street, Chicago, Illinois 60637.

1 Introduction

Economic historians commonly mark the Industrial Revolution as a major historical moment because it permeated every aspect of life: incomes and the population began to grow, and the standard of living began a consistent climb for the first time in history. While few would argue with such facts, the Industrial Revolution also marked a landmark change in the social responsibility of business organizations. Whereas many scholars argue that corporate social responsibility (CSR) has its roots in the 1950s, the framework began to emerge in the 19th century. As Wren (2005) points out, with the new factory system in Great Britain came social unrest due to concerns surrounding poverty, slums, and child labor. Businesses soon responded with the industrial welfare movement, which included the introduction of many worker perks (see pp. 269-270, Wren 2005). Whether such actions were profit motivated, socially motivated, or a mixture remains debated in the literature.

Among scientists, the virtues of CSR have been debated at least since Friedman (1970) famously described CSR as a “fundamentally subversive doctrine,” and declared that the only social responsibility of a firm is to maximize its profits.¹ Nevertheless, today CSR is a common business practice, as most large firms have entire branches dedicated to socially responsible concepts and practices. Indeed, hundreds of millions are budgeted annually for such programs.
One economic explanation for the prevalence of CSR is that making the world a better place is not necessarily at odds with profit maximization. For instance, proponents of CSR have argued that investments in CSR actually increase profits. There is, however, no consensus in the empirical literature on whether investments in CSR have a positive effect on the firm’s bottom line. While some studies report a positive effect (Waddock and Graves 1997), others find mixed, negative, or no effects (Barnett and Salomon 2012; Godfrey et al. 2009; Servaes and Tamayo 2013). A possible reason for these mixed findings is that with a few notable exceptions, empirical studies of CSR tend to focus primarily on the demand side of the market (Du et al. 2011; Elfenbein et al. 2012; McDonnell and King 2013; Servaes and Tamayo 2013). The demand side explanation posits that firms use CSR as a means to signal certain ethical or social values to consumers who have specific preferences. In other words, CSR can be seen as a form of marketing or product differentiation. While plausible, such efforts ignore the effect of CSR on the supply side of the market, or how one can utilize such signaling to impact the firm’s own employees. Under this view, investments in CSR provide non-pecuniary incentives to (i) encourage a greater variety, or number, of potential employees to seek employment at the firm, and once employed CSR can (ii) induce higher effort levels from employees, which increases labor productivity. Indeed, there are some recent studies of the supply side effects of CSR, but these tend to focus on employee retention or wage requirements (see Carnahan et al. 2015; Burbano 2016). In this study, we use a natural field experiment to identify important features on the supply side of a structural econometric model. The central premise is that workers respond heterogeneously to CSR, allowing us to estimate important behavioral parameters. 
Workers are willing to supply labor, produce output, and pay attention to quality based, in part, on their social preferences. Under such a model, investments in CSR have an effect on the intensive margin. Yet, heterogeneous preferences for CSR also cause workers to differ in their propensity to select into different jobs. A worker who prefers to work for a CSR firm will be more likely to apply for a job within this firm than with a non-CSR firm, ceteris paribus, inducing an effect of CSR on the extensive margin. Our natural field experiment permits an estimation of effects along both the intensive and extensive margins.

¹ It should be noted that Friedman (1970) added some important qualifiers to the end of his statement about CSR. In particular, the only social responsibility of the firm is to maximize its profits as long as it “stays within the rules of the game” and “avoids deception or fraud”.

We begin the analysis by presenting the structural model. The model provides guidance on the exogenous variation necessary to quantify unobservable worker characteristics, such as motivation and preferences. Our model draws upon the insights of Torgovitsky (2015) and D’Haultfoeuille and Février (2015). The identification argument consists of deriving two mappings between types of agents, wages, and hours worked. Once these mappings have been obtained, it is possible to retrieve the worker cost function and the distribution of cost types in the population. We extend this line of work in several directions. First, agents are characterized by a three-dimensional vector of parameters representing leisure preference, productivity, and accuracy. Second, since each agent chooses the utility-maximizing number of hours each workday, we introduce individual and day-specific shocks to the time budget constraint, and individual, day, and unit-specific shocks to productivity and quality.
This allows for more flexibility and prevents the model from being rejected if an agent is observed working a different number of hours during different days.

To provide the necessary experimental control and variation to estimate the model, we started our own consultancy firm, HHL Solutions, LLC, and used it as a vehicle for conducting the field experiment. The stated mission of the firm is to provide data collection services for various clients. Every aspect of the firm’s origination and organization is authentic. By starting our own firm, we are not only able to expose subjects to controlled variation in wages and information about the firm’s investments in CSR, but we also achieve a high degree of control over the hiring process and the work environment. Since subjects are recruited from an actual marketplace, we also improve upon the external validity by avoiding sample selection problems that can occur when subjects are recruited from a very specific population (Levitt and List 2007).

Our field experiment consists of randomized treatments administered in two stages: (1) a hiring stage and (2) a real effort work task stage. Both stages generate data on worker behavior when faced with different wage contracts and non-pecuniary incentives in the form of information about the firm’s investments in CSR. The field experiment is tightly linked to the theoretical model to ensure the appropriate instruments and to produce sufficient variation in the data such that the parameters of interest can be accurately estimated.

In the hiring stage of the experiment, we recruit subjects via job advertisements placed on online marketplaces for jobs in 12 cities throughout the US. We vary the offered wage rate and how much information is revealed about our firm’s involvement in CSR. The application rate in each treatment group is the first outcome variable.
This allows us to investigate how pecuniary incentives (the different wage rates) and non-pecuniary incentives (information about CSR) affect labor market decisions on the extensive margin. In particular, we are able to estimate whether non-pecuniary incentives in the form of CSR information increase the probability that a given individual applies for the job. The hiring stage serves one more important purpose: to provide an instrument for the selection effect of CSR on productivity, accuracy, and leisure preference.

In the second stage of the experiment, subjects who applied for the job proceed to conduct a real effort work task through a custom-designed online task system. The task system provides us with detailed information about the performance of each employee and allows us to construct measures of labor supply, productivity (both in terms of output per hour and the amount of time it takes to produce a given unit of output), and output quality. Treatments in the work stage vary information about the client associated with each day’s work. This allows us to estimate the intensive margin by turning CSR incentives on and off, either denoting the work as “CSR work” or not mentioning CSR. Importantly, both those who are recruited for a CSR job and those who are recruited without information about CSR eventually work for CSR and non-CSR clients once employment commences.

Several reduced form insights emerge. First, as expected, higher wages increase application rates; an increase of wages from $11 per hour to $15 per hour increases application rates by about 33 percent. Interestingly, advertising the firm as a CSR firm increases the application probability by roughly the same amount as an increase of hourly wages from $11 to $15. In this way, both wages and CSR have important effects on the extensive margin. Due to heterogeneity in preferences, individuals who are more likely to respond to CSR may perform differently while working on the actual job.
Since we observe the behavior in the first stage for all subjects who participate in the second stage, we are able to test whether this is the case. A sub-result under this first finding is that men appear to be more price sensitive than women, and also more elastic to CSR appeals. The former insight is consonant with the literature on charitable fundraising (List 2011). The latter insight is, however, at odds with the literature that suggests women are more responsive to social cues than men (Croson and Gneezy 2009).

Second, once employed, again both wages and CSR incentives matter. For example, workers paid higher wages log more hours. And, at each wage level, those working for CSR firms work more hours. Importantly, the initial selection of workers affects ultimate productivity, especially for women. For instance, those whom we recruit to work for a CSR firm are much more sensitive to working for a CSR firm than those who are recruited with no mention of CSR. In terms of output per hour, women respond very strongly to CSR incentives whereas men have a statistically insignificant response. This result is consonant with the insights from the literature (Croson and Gneezy 2009).

Third, there is no concomitant decrease in accuracy associated with CSR type women. Even though they produce more per hour, their production accuracy does not suffer. Yet, for men, product quality actually increases when employed by a CSR firm. Together, these insights suggest that CSR draws out higher output from women and higher quality from men.

In order to complement the reduced-form analysis, we use the data generated by the experiment to estimate the structural model. This allows for an in-depth analysis of a wide array of questions. In particular, it shows how the impact on labor supply, productivity, and accuracy is channeled through selection and treatment effects. This is an important feature of our methodology.
Structural economic techniques use statistical methods in combination with a theoretical model of worker behavior to answer questions about unobservable variables. Given this close connection between the model and the empirical method, the experiment is designed with the model in mind in order to establish an ideal data-generating process for estimating the components of the model. As a consequence, the model yields results that cannot be derived directly from the data and is particularly useful for generalizing the results outside the experimental setting.

We obtain a wealth of insights from the structural results. The baseline distribution of productivity is comparable between male and female workers. Moreover, there do not seem to be any significant treatment effects on productivity for either male or female workers. Hiring stage CSR incentives have a significant impact on female worker productivity, however; for a given level of on-the-job experience, women who select into the job under CSR incentives are about 40 percent more productive than those who select in with neutral information. For accuracy, we find that women are more accurate than men at baseline. But both selection and treatment effects induce men to increase their accuracy. This causes a convergence in accuracy patterns between male and female workers.

This study is related to several strands of literature. To the best of our knowledge, the only previous studies using field experiments to investigate the impact of CSR on labor market behavior are those of Burbano (2015, 2016). Through field experiments conducted via mTurk and Elance, Burbano (2016) estimates the reduced-form effects of CSR messages on labor output and salary requirements. The results show that CSR adds value to the firm both through increased output and by lowering employee salary requirements.
A particularly interesting result from this study is that higher performing workers also appear to be the ones with the greatest compensating differential. These workers are willing to give up a larger share of the wage to work for a firm that invests in CSR. Burbano (2015) uses a field experiment to explore the effect of CSR on virtual workers, and finds that CSR increases virtual workers’ willingness to do extra, unrequired work.

This paper also contributes to the broader empirical literature on the effects of CSR on firm performance (Greening and Turban 2000). The question has primarily been addressed within the management literature. We contribute an econometric methodology for investigating the supply side effects of CSR.

We also contribute to the relatively new strand of literature within economics which uses experiments to generate data for structural models. Previous work has employed field experiments for structural analyses of, for example, charitable giving (DellaVigna et al. 2012), voting behavior (DellaVigna et al. 2015), and disappointment aversion (Gill and Prowse 2012). We contribute to this literature by providing an in-depth labor market application.

Finally, this study contributes to the literature on equalizing differences (Roback 1982; Rosen 1986). Our structural analysis of selection effects constitutes a new way of estimating compensating differentials or equalizing differences across jobs that vary non-pecuniary incentives.

The methodology presented in this paper has large applicability in areas beyond the study of CSR. One could readily use an almost identical method to estimate the effect of virtually any workplace characteristic over which there are heterogeneous worker preferences. Examples include flexible hours and workplace competition.

The rest of the paper proceeds as follows. Section 2 outlines the theoretical model and discusses its identification.
The identification argument provides a number of insights regarding how the experiment must be designed to achieve the proper variation in the data. Section 3 describes the field experimental design and explains how it enables us to identify the parameters of the model. Section 4 presents the reduced form results from the field experiment. This includes both the results at the extensive margin and those at the intensive margin. Section 5 presents the results from the structural model. Finally, Section 6 concludes the paper.

2 The Model

This section presents the model and its identification. The identification argument will reveal what kind of variation we need to observe in the data in order to identify the parameters of interest. Since theory guides the experimental design, we will postpone the discussion of the natural field experiment until the next section of the paper.

We begin by outlining a simplified version of the model. The purpose is to explain how the results of D’Haultfoeuille and Février (2015) and Torgovitsky (2015) apply to the problem at hand. In the simplified model, agent heterogeneity is represented by a one-dimensional cost parameter and the only observed output is labor supply. The simple model does not include non-pecuniary CSR incentives. Instead, we confine the discussion to how exogenous variation in wages allows us to identify the cost function and the distribution of the individual cost parameter.

We then extend the model to account for individual heterogeneity in three dimensions: leisure preference, productivity, and accuracy. We also introduce day and individual-specific shocks to these three parameters. This allows for more flexibility and prevents the model from being rejected when we observe a given individual working a different number of hours during different days. Finally, when agents are observed during several time periods, we may occasionally observe zero output.
This is simply due to the fact that some agents, given their time constraint, do not find it desirable to work every single day. We therefore adapt the model to unbalanced panel situations with missing data.

2.1 Identification in The Baseline Model

Suppose that $i = 1, 2, \ldots, I$ agents are randomly assigned to a wage contract with wage rate $w_i \in \{w_1, w_2\}$ where $w_2 > w_1$. The cost of working $h$ hours is given by the function $C(h; \theta_i) = \theta_i c_k(h)$ where $\theta_i \sim F_k$, $k \in \{1, 2\}$, is an idiosyncratic cost-shifting parameter representing agent $i$’s shadow value of time. The distribution $F_k$ is unknown and to be estimated. A greater $\theta_i$ means that the opportunity cost of working is higher and that we expect the agent to work fewer hours than an agent with a lower $\theta_i$ if they are paid the same hourly wage rate. We may think of $\theta_i$ as representing individual $i$’s leisure preference. The homogenous part of the cost function $c_k(h)$ is assumed to be increasing and strictly convex: $c'(h) > 0$, $c''(h) > 0$. Each agent is assumed to choose a number of hours $h^*$ so as to maximize utility, defined by income net of cost:

\[ h^* = \arg\max_{h \ge 0} \{ w_i h - \theta_i c_k(h) \} \]

Since the cost function is increasing and convex, there exists a unique solution satisfying the FOC:

\[ w_i - \theta_i c'(h^*) = 0 \]

Note that the expression $\theta_i(h) = w_i / c'(h)$ is monotonically decreasing in $h$ since $c'(h) > 0$ and $c''(h) > 0$. Our goal is to identify the distribution $F_k$ and the cost function $c(h)$ given the empirical distributions of hours. To do so, we start by retrieving a function that maps between hours and cost types for each wage contract. Consider the limit in which $I \to \infty$ and we observe output generated by every possible $\theta$. Let $\theta^{(k)}(h)$ denote the cost type that would induce $h$ hours under wage $k$. The following two exclusion restrictions are necessary for identification:

\[ F_1 = F_2 = F \tag{1} \]
\[ c_1(h) = c_2(h) = c(h) \tag{2} \]

Restriction (1) says that agent cost types are drawn from the same distribution and do not depend on the wage.
This assumption is motivated if there is no selection of agent types into wage contracts. The second exclusion restriction (2) says the homogenous part of the cost function is the same across wage contracts. This is motivated if the task is uniform across contracts. Note that both assumptions are easily motivated by proper experimental design; with controlled randomization of subjects to wage treatments there will be no selection. Similarly, with a sufficient degree of control over the experimental setting, it is possible to design a task that is uniform across treatments.

Suppose agent $i$ is assigned to wage contract $w_1$ and works $h_i$ hours. This implies that $\theta_i$ satisfies $\theta_i c'(h_i) = w_1$. Now, suppose another agent $j$ works the same number of hours $h_j = h_i = h$ under wage contract $w_2$. This implies that $\theta_j > \theta_i$ since $j$ requires a higher wage rate to work the same number of hours as $i$. By using the FOC, we can derive a relationship between the cost parameters associated with $i$ and $j$. In particular:

\[ \theta_i = \frac{w_1}{c'(h)}, \quad \theta_j = \frac{w_2}{c'(h)} \;\Rightarrow\; \theta_j = \frac{w_2}{w_1}\theta_i \;\Rightarrow\; \theta^{(2)}(h) = \frac{w_2}{w_1}\theta^{(1)}(h) \tag{3} \]

Equation (3) provides a mapping between cost types who work the same number of hours across wage contracts.

Suppose that we instead fix the cost type and ask how many hours an agent working under wage contract $w_1$ would work if assigned to the counterfactual wage contract $w_2$. To answer this question, we use the fact that an agent’s observed number of hours is monotonic in $\theta_i$. For example, the agent with the median $\theta_i$ is going to work the median number of hours under either wage contract. Hence, if agent $i$ with cost type $\theta_i$ works $h_i$ hours under wage contract $w_1$, and $h_i$ is the median in the empirical distribution of hours among the agents working under $w_1$, then we conclude that $i$ would also work the median number of hours in the empirical distribution for $w_2$, had the agent been assigned to $w_2$. In particular, let $G_k$ denote the empirical distribution of hours for $k \in \{1, 2\}$.
Since $F(\theta) = 1 - G_1(h^{(1)}(\theta)) = 1 - G_2(h^{(2)}(\theta))$, we have

\[ h^{(2)}(\theta) = G_2^{-1}\left[ G_1(h^{(1)}(\theta)) \right] \tag{4} \]

Once the mapping (3) between cost types for a fixed number of hours and the mapping (4) between hours for a given cost type have been established, it is possible to obtain a sequence of points on the graphs of $\theta^{(1)}(h)$ and $\theta^{(2)}(h)$. To do so, we begin by normalizing a given $\theta^*$ to use as a starting point. For some $\theta^* \in [\underline{\theta}, \overline{\theta}]$ and associated hours $h^*$, let $\theta^{A}(h^*) = \theta^*$. This amounts to fixing total cost units since

\[ C(h; \theta) = \theta c(h) \iff \tilde{C}(h; \theta) = \iota\theta \, \frac{c(h)}{\iota}, \quad \iota \in \mathbb{R}_+ \]

By the argument of D’Haultfoeuille and Février (2015) and Torgovitsky (2015), given $(\theta^*, h^*)$, (3) and (4) identify $\theta^{(k)}(h)$, $k \in \{1, 2\}$, if there is a crossing point between $\theta^{(1)}(h)$ and $\theta^{(2)}(h)$. Otherwise, $\theta^{(k)}(h)$, $k \in \{1, 2\}$, is partially identified. Thus $\theta^{(1)}(h)$ and $\theta^{(2)}(h)$ can be obtained. By using $\theta^{(1)}(h)$ and the FOC, we can then retrieve $c'(h)$ via $G_1$.

2.2 Identification of The Extended Model

We now extend the model to include individual heterogeneity in leisure preference, productivity, and accuracy. Individual heterogeneity is thus described by a tuple $(\theta_L, \theta_P, \theta_A)$ where $\theta_L$ is the leisure preference parameter from the previous section, $\theta_P$ is a productivity parameter in individual $i$’s production function, and $\theta_A$ is a parameter governing the accuracy with which individual $i$ produces output. All three parameters are exposed to treatment and selection effects due to CSR incentives. These effects, along with the distribution of the parameters, are unobserved and to be estimated.

To fix ideas, consider a two-stage process in which a firm (1) hires workers and (2) lets them work during a given period of time. The firm can incentivize its workers through the wage rate and through a non-pecuniary incentive in the form of information about the firm’s investments in CSR. These incentives can be implemented both in a hiring stage and in a work stage.
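Returning briefly to the baseline argument of Section 2.1, the alternation between the quantile mapping (4) and the cost-type mapping (3) can be illustrated numerically. The Python sketch below is only an illustration under stated assumptions and is not the estimation procedure used in the paper: the function name `trace_cost_types`, the lognormal type distribution, the marginal cost $c'(h) = h$ (in arbitrary units), and the sample size are all hypothetical choices, while the wage rates 11 and 15 simply borrow the two hourly wages mentioned in the introduction.

```python
import numpy as np


def trace_cost_types(hours_w1, hours_w2, w1, w2, h_start, theta_start, steps=3):
    """Trace points (h, theta) on the curve theta^(1)(h) from a normalized start."""
    sorted_w1 = np.sort(np.asarray(hours_w1, dtype=float))
    hours_w2 = np.asarray(hours_w2, dtype=float)
    h, theta = float(h_start), float(theta_start)
    points = [(h, theta)]
    for _ in range(steps):
        # Mapping (4): a type keeps its rank in the hours distribution across
        # wage contracts, so its hours under w2 are G2^{-1}(G1(h)).
        rank = np.searchsorted(sorted_w1, h, side="right") / sorted_w1.size
        h = float(np.quantile(hours_w2, rank))
        # Mapping (3): at the same hours, theta^(1)(h) = (w1 / w2) * theta^(2)(h).
        theta *= w1 / w2
        points.append((h, theta))
    return points


# Simulated data (all distributional choices here are assumptions).
rng = np.random.default_rng(0)
w1, w2 = 11.0, 15.0
n = 200_000
# Random assignment: both wage groups draw types from the same distribution F.
theta_1 = rng.lognormal(mean=0.0, sigma=0.5, size=n)
theta_2 = rng.lognormal(mean=0.0, sigma=0.5, size=n)
# Assumed marginal cost c'(h) = h, so the FOC w = theta * c'(h) gives h = w / theta.
hours_1 = w1 / theta_1
hours_2 = w2 / theta_2

# Normalize at the median of hours under w1, using the true scale of theta
# so that the recovered marginal cost can be compared with c'(h) = h.
h0 = float(np.median(hours_1))
points = trace_cost_types(hours_1, hours_2, w1, w2, h0, theta_start=w1 / h0)

# Recover marginal cost from the FOC under w1: c'(h) = w1 / theta^(1)(h).
marginal_cost = [(h, w1 / theta) for h, theta in points]
for h, mc in marginal_cost:
    print(f"h = {h:6.2f}   recovered c'(h) = {mc:6.2f}")
```

In this simulated setting the recovered marginal cost should track the assumed $c'(h) = h$ up to sampling error. The starting point here uses the true scale of $\theta$ only to make that comparison easy; with an arbitrary normalization the recovered function would differ by a constant factor, which corresponds to the choice of cost units discussed in Section 2.1.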
In the hiring stage, each worker is offered a given wage rate and may or may not be exposed to the non-pecuniary CSR incentive. We may think of this in terms of whether the firm openly advertises its CSR efforts when recruiting workers. The primary question of interest is how information about the firm’s investments in CSR attracts workers who are inherently different in terms of productivity, accuracy, and leisure preference than those in the general population. We will refer to this effect as the selection effect of CSR.

Individuals who proceed to work for the firm are paid whatever hourly wage rate they were offered in the hiring stage. Some of the individuals who begin to work for the firm were exposed to CSR incentives in the hiring stage while others were not. Information about CSR may, however, also be introduced any time during the work stage. This affects decisions at the intensive margin. One may think of this in terms of the firm having employees working on different “projects” where some projects are associated with CSR and some are not. We will refer to the effect of work stage CSR incentives on productivity, accuracy, and labor supply as the treatment effect of CSR. Of course, the treatment effect may differ across subjects who selected into the job under CSR incentives and those who selected in under monetary incentives only.

When we observe agents working for several days, we may encounter a situation in which the simple model described in the previous sub-section can be rejected. Since $\theta_{ji}$ does not depend on $t$ for $j \in \{L, P, A\}$, each agent is predicted to work the same number of hours each day. The model would hence be rejected if we were to observe an agent working a different number of hours during two different periods. Moreover, the simple model does not account for unbalanced panel issues arising in cases where agents only work during some of the possible days.
In order to allow for more flexibility, we let productivity, accuracy, and the agent’s time constraint be exposed to period-specific shocks.

We will now discuss the identification of the parameters associated with each agent’s leisure preference, productivity, and accuracy type. In each case, we will separate the treatment effect from the selection effect.

2.3 Leisure Preference

An agent’s leisure preference type affects how much labor the agent is willing to supply. This includes the decision of whether to work at all during a given day. Agent $i$’s period $t$ leisure preference type is assumed to be affected by CSR incentives in a multiplicative fashion:

\[ \theta_{Lit} = \theta_{Li} \, \beta_{L0}^{X_{0i}} \, (\beta_{L0}^{F})^{X_{0i} D_i^F} \, \beta_{L1}^{X_{1it}} \, (\beta_{L1}^{F})^{X_{1it} D_i^F} \, u_{it}^{L} \tag{5} \]

In this expression, $\theta_{Li}$ is the permanent, average leisure preference type for individual $i$. We assume that $\theta_{Li}$ follows a distribution $\theta_{Li} \sim F_L(\theta_{Li}, \alpha_L)$ where $\alpha_L$ is a vector of parameters to be estimated. The variables $X_{0i}$ and $X_{1it}$ are indicators for CSR incentives in the hiring and work task stages. $D_i^F$ is a dummy variable for whether individual $i$ is female. The parameters $\beta_{L0}$ and $\beta_{L1}$ are selection and treatment effects, and the parameters $\beta_{L0}^F$ and $\beta_{L1}^F$ are the female interaction selection and treatment effects. Note that $u_{it}^L$ is a shock to the shadow value of time. The shock accounts for the fact that people may have other things going on in their lives affecting how much time they are willing to spend working on the task at hand. By introducing a shock to the time budget constraint, we let the model be flexible enough to allow for cases in which a given agent works a different number of hours during different days.

Let $w_i$ denote the wage assigned to agent $i$ and let $h_{it}$ denote the number of hours worked by agent $i$ in period $t$.
Each agent is assumed to choose a number of hours $h_{it}$ so as to maximize utility given by income net of the utility cost function:

\[ h^* = \arg\max_{h \ge 0} \{ w_i h - C_{it}(h; \theta_{Lit}, \alpha_c) \} \]

The cost function is assumed to be separable into an idiosyncratic and a homogenous part:

\[ C_{it}(h; \theta_{Lit}, \alpha_c) = \theta_{Lit} \, c(h; \alpha_c) \tag{6} \]

The idiosyncratic part is the leisure preference parameter already defined above. The homogenous part is assumed to depend on a parameter vector $\alpha_c$. We assume that $c(h; \alpha_c)$ is strictly increasing and strictly convex: $c'(\cdot; \alpha_c) > 0$, $c''(\cdot; \alpha_c) > 0$. Note that this includes the special case of $c'(0; \alpha_c) > 0$. For most of what follows, we will suppress the dependency on the parameter vector $\alpha_c$ whenever motivated and simply write $c(h)$.

For any given agent $i$ and period $t$, one of two mutually exclusive cases can occur: we observe a positive number of hours worked, $h_{it} > 0$, or we observe zero hours worked, $h_{it} = 0$. The FOC associated with (6) will therefore satisfy

\[ w_i \le \theta_{Lit} \, c'(h_{it}), \quad i = 1, 2, \ldots, I, \quad t = 1, 2, \ldots, T \]

The first case will help us identify the treatment effect while the second case will help us identify the selection effect. We will treat each of these two cases separately. But before we proceed, we need to establish identification of the cost function.

2.3.1 Identification of the Cost Function

In order to identify the cost function, we begin by limiting our focus to the subset of the sample for which $h_{it} > 0$. Every individual-day observation is treated as if it were a unique person, in a setting with no shocks, having fixed cost type $\tilde{\theta}_m^{(k)}$, $m = 1, 2, \ldots, M$, where $w_k$, $k \in \{1, 2\}$, denotes the two different wage rates and the number of day-individual combinations with $h_{it} > 0$ is $M$. Intuitively, $\tilde{\theta}$ represents all factors influencing a person’s leisure shadow value on a given day, including treatments and shocks, but for the present stage all these terms are considered as one whole.
The FOC implies

\[ \tilde{\theta}_m^{(k)}(h_m^k) = \frac{w_k}{c_k'(h_m^k; \alpha_c)} \]

and the exclusion restrictions are

Assumption 1. The type distributions are invariant to wage treatment: $F_L^{(k)} = F_L$, $\forall k$.

Assumption 2. The cost functions are invariant to wage treatment: $c_k = c$, $\forall k$.

Again, these assumptions are motivated if the experimental design utilizes random assignment of wages to subjects and the task is uniform across treatments. The equilibrium mapping between hours worked across wage contracts for a given type can now be rewritten as

\[ h_1(\tilde{\theta}) = G_1^{-1}\left[ G_2(h_2(\tilde{\theta})) \right] \tag{7} \]

where $G_k$ is the distribution of hours worked under $w_k$. Similarly, from the FOC, we obtain

\[ \tilde{\theta}_1(h) = \frac{w_1}{w_2} \tilde{\theta}_2(h) \tag{8} \]

We now normalize the median of $G_2(h)$ so that $\tilde{\theta}_2(G_2^{-1}(0.5)) \equiv 1$. It is possible to trace out (7) and (8) from the normalized point and obtain $\tilde{\theta}_1(h)$ and $\tilde{\theta}_2(h)$. By Assumptions 1 and 2, we can apply the results of D’Haultfoeuille and Février (2015) and Torgovitsky (2015) and use the FOC for $k = 2$,

\[ c'(h) = \frac{w_2}{\tilde{\theta}_2(h)} \]

to identify the parameters of $c(h; \alpha_c)$.

2.3.2 Identification of Leisure Preference Treatment Effects

As the cost function has been retrieved, we can identify the treatment effect parameters. To do so, we begin by assuming $h_{it} > 0$ so that agent $i$ is actually observed working during period $t$.
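Then the FOC holds with equality:

\[ w_i = \theta_{Lit}\, c'(h_{it}) \tag{9} \]

Taking logs of (9) gives us

\[ \ln(w_i) - \ln(c'(h_{it})) - \ln(\theta_{Li}) - X_{0i} \ln(\beta_{L0}) - X_{0i} D_i^F \ln(\beta_{L0}^F) - X_{1it} \ln(\beta_{L1}) - X_{1it} D_i^F \ln(\beta_{L1}^F) = \varepsilon_{Lit} \tag{10} \]

and

\[ Y_{it} \equiv \ln\left( \frac{w_i}{c'(h_{it})} \right) = \varepsilon_{Lit} + \ln(\theta_{Li}) + X_{0i} \ln(\beta_{L0}) + X_{0i} D_i^F \ln(\beta_{L0}^F) + X_{1it} \ln(\beta_{L1}) + X_{1it} D_i^F \ln(\beta_{L1}^F) \]

Let

\[ \varepsilon_{Ci}^{L} = \begin{cases} \dfrac{\sum_{t=1}^{T} \varepsilon_{Lit}\, 1(h_{it} > 0 \,\&\, X_{1it} = 1)}{\sum_{t=1}^{T} 1(h_{it} > 0 \,\&\, X_{1it} = 1)} & \text{if } \sum_{t=1}^{T} 1(h_{it} > 0 \,\&\, X_{1it} = 1) > 0 \\ 0 & \text{otherwise} \end{cases} \]

\[ \varepsilon_{Ni}^{L} = \begin{cases} \dfrac{\sum_{t=1}^{T} \varepsilon_{Lit}\, 1(h_{it} > 0 \,\&\, X_{1it} = 0)}{\sum_{t=1}^{T} 1(h_{it} > 0 \,\&\, X_{1it} = 0)} & \text{if } \sum_{t=1}^{T} 1(h_{it} > 0 \,\&\, X_{1it} = 0) > 0 \\ 0 & \text{otherwise} \end{cases} \]

Here we use the subscripts \(C\) and \(N\) to denote cases where there is a CSR incentive (\(X_{1it} = 1\)) and where there is no such incentive (\(X_{1it} = 0\)). The selected distribution of \(\varepsilon_{Lit}\) conditional on \(h_{it}\) is the same regardless of treatment status since selection on the extensive margin only depends on \(X_{0i}\) for a given \(\theta_{Li}\). It follows that

\[ E[\varepsilon_{Ci}^{L}] = E[\varepsilon_{Ni}^{L}] \tag{11} \]

Now, let

\[ Y_{Ci} = \begin{cases} \dfrac{\sum_{t=1}^{T} \ln\left( \frac{w_i}{c'(h_{it})} \right) 1(h_{it} > 0 \,\&\, X_{1it} = 1)}{\sum_{t=1}^{T} 1(h_{it} > 0 \,\&\, X_{1it} = 1)} & \text{if } \sum_{t=1}^{T} 1(h_{it} > 0 \,\&\, X_{1it} = 1) > 0 \\ 0 & \text{otherwise} \end{cases} \]

\[ Y_{Ni} = \begin{cases} \dfrac{\sum_{t=1}^{T} \ln\left( \frac{w_i}{c'(h_{it})} \right) 1(h_{it} > 0 \,\&\, X_{1it} = 0)}{\sum_{t=1}^{T} 1(h_{it} > 0 \,\&\, X_{1it} = 0)} & \text{if } \sum_{t=1}^{T} 1(h_{it} > 0 \,\&\, X_{1it} = 0) > 0 \\ 0 & \text{otherwise} \end{cases} \]

and note that whenever

\[ C \equiv \sum_{t=1}^{T} 1(h_{it} > 0 \,\&\, X_{1it} = 1) > 0 \quad \& \quad N \equiv \sum_{t=1}^{T} 1(h_{it} > 0 \,\&\, X_{1it} = 0) > 0 \]

and \(\tilde\varepsilon_{Li} \equiv \varepsilon_{Ci}^{L} - \varepsilon_{Ni}^{L}\), we have

\[ Y_{Ci} - Y_{Ni} = \ln(\beta_{L1}) + D_i^F \ln(\beta_{L1}^F) + \tilde\varepsilon_{Li} \]

where

\[ E[\tilde\varepsilon_{Li}] = 0, \qquad E[D_i^F \tilde\varepsilon_{Li}] = 0 \]

For shorthand notation, let \(b_{L1} = \ln(\beta_{L1})\) and \(b_{L1}^F = \ln(\beta_{L1}^F)\). The baseline treatment effect \(\beta_{L1}\) and the female-specific treatment effect \(\beta_{L1}^F\) are identifiable and estimable via

\[ \begin{bmatrix} \hat\beta_{L1} \\ \hat\beta_{L1}^F \end{bmatrix} = \exp\left\{ \arg\min_{[b_{L1},\, b_{L1}^F]'} \sum_{i=1}^{I} \left( Y_{Ci} - Y_{Ni} - b_{L1} - b_{L1}^F D_i^F \right)^2 1(C > 0,\, N > 0) \right\} \tag{12} \]

2.3.3 Leisure Preference Selection Effects

We now turn to the selection effect on leisure preference.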
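When \(h_{it} = 0\), we have \(w_i < \theta_{Lit}\, c'(0)\) and

\[ \ln\left( \frac{w_i}{c'(0)} \right) - \ln(\theta_{Li}) - X_{0i} \ln(\beta_{L0}) - X_{0i} D_i^F \ln(\beta_{L0}^F) < \varepsilon_{Lit} \tag{13} \]

Let \(e_i\) denote the LHS of (13), as it is the empirical analogue to \(\varepsilon_{Lit}\) in the case of \(h_{it} = 0\). We also define \(\rho_i\) as the probability of observing zero hours from individual \(i\) during a given day \(t\):

\[ \rho_i = \Pr[h_{it} = 0] = E[1(h_{it} = 0)] \tag{14} \]

Then we have

\[ \rho_i = \Pr[e_i < \varepsilon_{Lit}] = 1 - \Phi(e_i) = \Phi(-e_i) \]

and

\[ \Phi^{-1}(\rho_i) = -\ln\left( \frac{w_i}{c'(0)} \right) + \ln(\theta_{Li}) + X_{0i} \ln(\beta_{L0}) + X_{0i} D_i^F \ln(\beta_{L0}^F) \tag{15} \]

Now, let

\[ \eta^{l}_{rj} = \frac{\sum_{i=1}^{I} \ln(\theta_{Li}) \times 1[(w_i = j) \,\&\, D_i^F = 1(l = F) \,\&\, X_{0i} = 1(r = C)]}{\sum_{i=1}^{I} 1[(w_i = j) \,\&\, D_i^F = 1(l = F) \,\&\, X_{0i} = 1(r = C)]}, \]

\[ \omega^{l}_{rj} = \frac{\sum_{i=1}^{I} \Phi^{-1}(\rho_i) \times 1[(w_i = j) \,\&\, D_i^F = 1(l = F) \,\&\, X_{0i} = 1(r = C)]}{\sum_{i=1}^{I} 1[(w_i = j) \,\&\, D_i^F = 1(l = F) \,\&\, X_{0i} = 1(r = C)]}, \]

\[ j \in \{w_1, w_2\}, \quad r \in \{C, N\}, \quad l \in \{M, F\} \]

From the random assignment of individuals to wage contracts, and from the exclusion restriction

\[ E[\theta_L \mid X_0] = E[\theta_L] \]

it follows that the selection effect parameters \(\beta_{L0}\), \(\beta_{L0}^F\) are identified via

\[ E\left[ \omega^{M}_{Cw_2} - \omega^{M}_{Nw_2} \right] = E\left[ \eta^{M}_{Cw_2} - \eta^{M}_{Nw_2} \right] + \ln(\beta_{L0}) = \ln(\beta_{L0}) \tag{16} \]

\[ E\left[ \omega^{M}_{Cw_1} - \omega^{M}_{Nw_1} \right] = E\left[ \eta^{M}_{Cw_1} - \eta^{M}_{Nw_1} \right] + \ln(\beta_{L0}) = \ln(\beta_{L0}) \tag{17} \]

\[ E\left[ \omega^{F}_{Cw_2} - \omega^{F}_{Nw_2} \right] = E\left[ \eta^{F}_{Cw_2} - \eta^{F}_{Nw_2} \right] + \ln(\beta_{L0}) + \ln(\beta_{L0}^F) = \ln(\beta_{L0}) + \ln(\beta_{L0}^F) \tag{18} \]

\[ E\left[ \omega^{F}_{Cw_1} - \omega^{F}_{Nw_1} \right] = E\left[ \eta^{F}_{Cw_1} - \eta^{F}_{Nw_1} \right] + \ln(\beta_{L0}) + \ln(\beta_{L0}^F) = \ln(\beta_{L0}) + \ln(\beta_{L0}^F) \tag{19} \]

Finally, we need to establish identification of the individual fixed effects, or in other words, the sequence \(\{\theta_{Li}\}_{i=1}^{I}\). But with \(c(\cdot; \hat\alpha_c)\), \(\hat\beta_{L1}\), \(\hat\beta_{L1}^F\), \(\hat\beta_{L0}\), and \(\hat\beta_{L0}^F\) known, the sequence \(\{\theta_{Li}\}_{i=1}^{I}\) is identified as well. If \(h_{it} > 0\), \(\forall i, t\), then \(\{\theta_{Li}\}_{i=1}^{I}\) could be retrieved directly from (10).

2.4 Productivity

Productivity determines the amount of time it takes to produce a unit of output. Let \(\theta_{Pit}\) denote individual \(i\)'s period \(t\) productivity type.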
Assume that \(\theta_{Pit}\) is related to \(i\)'s baseline productivity type \(\theta_{Pi}\) in a multiplicative fashion:

\[ \theta_{Pit} = \theta_{Pi}\, \beta_{P1}^{X_{1it}} (\beta_{P1}^F)^{X_{1it} D_i^F} \beta_{P0}^{X_{0i}} (\beta_{P0}^F)^{X_{0i} D_i^F} \tag{20} \]

Also assume that \(\theta_{Pi} \sim F_P(\theta_{Pi}, \alpha_P)\) where \(\alpha_P\) is a vector of parameters to be estimated. The parameters \(\beta_{P0}\) and \(\beta_{P1}\) represent the selection and treatment effects, respectively, of the CSR incentives on productivity. The two additional parameters \(\beta_{P0}^F\) and \(\beta_{P1}^F\) account for gender differences in selection and treatment effects. Let \(Q_{it}\) be the total number of units produced by \(i\) up until the end of period \(t\), so that \(Q_{iT}\) is the total number of units produced by \(i\) during the whole work period. The \(q\)th unit produced by \(i\) is simply denoted by \(q_i\), so that \(q_i = 1, 2, \dots, Q_{iT}\).

2.4.1 Productivity Treatment Effects

The total time required for all output through unit \(q\) is given by the production technology \(\tau_t\) defined by

\[ \tau_t(q; \theta_{Pit}) = \left( \frac{q}{\theta_{Pit}} \right)^{\delta} \tilde{u}_q^P \tag{21} \]

Taking logs of the production technology gives us

\[ \ln(\tau_{qi}) = \delta \ln(q_i) - \delta \ln(\theta_{Pi}) - \delta \ln(\beta_{P1}) X_{1qi} - \delta \ln(\beta_{P1}^F) X_{1qi} D_i^F - \delta \ln(\beta_{P0}) X_{0i} - \delta \ln(\beta_{P0}^F) X_{0i} D_i^F + \tilde\varepsilon_{qi}^P \tag{22} \]

Consider two consecutive units of output \(q_i - 1\) and \(q_i\) and use (22) to form the log difference in time required to produce each of them:

\[ \ln\left( \frac{\tau_{qi}}{\tau_{qi-1}} \right) = \delta \ln\left( \frac{q_i}{q_i - 1} \right) - \delta \ln(\beta_{P1}) (X_{1qi} - X_{1(qi-1)}) - \delta \ln(\beta_{P1}^F) (X_{1qi} - X_{1(qi-1)}) D_i^F + \varepsilon_{qi}^P \tag{23} \]

Since

\[ E[\varepsilon_{qi}^P] = 0, \quad E[\varepsilon_{qi}^P q_i] = 0, \quad E[\varepsilon_{qi}^P X_{1qi}] = 0, \quad E[\varepsilon_{qi}^P D_i^F] = 0 \]

the parameters of (23) are identified.
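The differencing argument behind (23) can be illustrated with a minimal sketch. It simulates noise-free cumulative production times from (21) for a single worker and recovers \(\delta\) and \(\beta_{P1}\) by least squares on the log differences. The parameter values, the treatment sequence, and the helper name `fit_log_differences` are assumptions made for illustration only; they are not the paper's data or code, and the gender interaction is omitted for brevity.

```python
import math


def fit_log_differences(tau, x1):
    """Least squares for ln(tau_q / tau_{q-1}) = a * r1 + b * r2 (no intercept).

    tau[k] is the cumulative time through unit q = k + 1 and x1[k] is the
    CSR dummy in effect for that unit. With a = delta and b = delta * ln(beta),
    the function returns (delta_hat, beta_hat). Hypothetical helper.
    """
    s11 = s12 = s22 = t1 = t2 = 0.0
    for k in range(1, len(tau)):
        q = k + 1
        y = math.log(tau[k] / tau[k - 1])
        r1 = math.log(q / (q - 1))      # regressor multiplying delta
        r2 = -(x1[k] - x1[k - 1])       # regressor multiplying delta * ln(beta)
        s11 += r1 * r1
        s12 += r1 * r2
        s22 += r2 * r2
        t1 += r1 * y
        t2 += r2 * y
    det = s11 * s22 - s12 * s12         # determinant of the 2x2 normal equations
    a = (t1 * s22 - t2 * s12) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, math.exp(b / a)


# Assumed illustrative values (not estimates from the experiment).
delta, theta, beta = 0.8, 2.0, 1.1
x1 = [0, 0, 1, 1, 1, 0, 0, 1, 0, 1]
tau = [(q / (theta * beta ** x)) ** delta for q, x in enumerate(x1, start=1)]

delta_hat, beta_hat = fit_log_differences(tau, x1)
```

Because the simulated times contain no shocks, the fit reproduces the assumed \(\delta\) and \(\beta_{P1}\) up to floating-point error; note that the baseline type \(\theta_{Pi}\) drops out of the differences, which is the point of the construction.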
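Now, define

\[ \tilde\varepsilon_{Ci}^{P} = \begin{cases} \dfrac{\sum_{q_i=1}^{Q_{iT}} \tilde\varepsilon_{qi}^{P}\, 1(X_{1qi} = 1)}{\sum_{q_i=1}^{Q_{iT}} 1(X_{1qi} = 1)} & \text{if } \sum_{q_i=1}^{Q_{iT}} 1(X_{1qi} = 1) > 0 \\ 0 & \text{otherwise} \end{cases} \]

\[ \tilde\varepsilon_{Ni}^{P} = \begin{cases} \dfrac{\sum_{q_i=1}^{Q_{iT}} \tilde\varepsilon_{qi}^{P}\, 1(X_{1qi} = 0)}{\sum_{q_i=1}^{Q_{iT}} 1(X_{1qi} = 0)} & \text{if } \sum_{q_i=1}^{Q_{iT}} 1(X_{1qi} = 0) > 0 \\ 0 & \text{otherwise} \end{cases} \]

\[ Y_{ji}^{P} = \begin{cases} \dfrac{\sum_{q_i=1}^{Q_{iT}} [\ln(\tau_{qi}) - \delta \ln(q_i)]\, 1[X_{1qi} = 1(j = C)]}{\sum_{q_i=1}^{Q_{iT}} 1[X_{1qi} = 1(j = C)]} & \text{if } \sum_{q_i=1}^{Q_{iT}} 1[X_{1qi} = 1(j = C)] > 0 \\ 0 & \text{otherwise} \end{cases} \qquad j = C, N \]

Since the error \(\tilde\varepsilon_{qi}^{P}\) is independent of treatment status, it follows that whenever

\[ \sum_{q_i=1}^{Q_{iT}} 1(X_{1qi} = 1) > 0 \quad \& \quad \sum_{q_i=1}^{Q_{iT}} 1(X_{1qi} = 0) > 0 \]

then

\[ \frac{Y_{Ni}^{P} - Y_{Ci}^{P}}{\delta} = \ln(\beta_{P1}) + \ln(\beta_{P1}^F) D_i^F + \tilde\varepsilon_i^{P} \]

where

\[ E[\tilde\varepsilon_i^{P}] = 0, \qquad E[\tilde\varepsilon_i^{P} D_i^F] = 0 \]

Hence, writing \(b_{P1} = \ln(\beta_{P1})\) and \(b_{P1}^F = \ln(\beta_{P1}^F)\), the treatment effects are identifiable and estimable via

\[ \begin{bmatrix} \hat\beta_{P1} \\ \hat\beta_{P1}^F \end{bmatrix} = \exp\left\{ \arg\min_{[b_{P1},\, b_{P1}^F]'} \sum_{i=1}^{I} \left( \frac{Y_{Ni}^{P} - Y_{Ci}^{P}}{\delta} - b_{P1} - b_{P1}^F D_i^F \right)^2 1\left( \sum_{q_i=1}^{Q_{iT}} 1(X_{1qi} = 1) > 0 \,\&\, \sum_{q_i=1}^{Q_{iT}} 1(X_{1qi} = 0) > 0 \right) \right\} \tag{24} \]

The learning parameter \(\delta\) is estimated by

\[ \hat\delta = \arg\min_{\delta} \sum_{i=1}^{I} \sum_{q_i=2}^{Q_i} \left( \ln\left( \frac{\tau_{qi}}{\tau_{qi-1}} \right) - \delta \ln\left( \frac{q_i}{q_i - 1} \right) \right)^2 1(X_{1qi} = X_{1(qi-1)}) \]

2.4.2 Productivity Selection Effects

For the productivity selection effects, let

\[ \omega^{Pl}_{r} = \frac{\sum_{i=1}^{I} \sum_{q_i=1}^{Q_{iT}} [\ln(\tau_{qi}) - \delta \ln(q_i) - \delta \ln(\beta_{P1}) - \delta \ln(\beta_{P1}^F) X_{1qi} D_i^F] \times 1[D_i^F = 1(l = F) \,\&\, X_{0i} = 1(r = C)]}{\sum_{i=1}^{I} \sum_{q_i=1}^{Q_{iT}} 1[D_i^F = 1(l = F) \,\&\, X_{0i} = 1(r = C)]}, \quad l = M, F, \; r = C, N \]

\[ \eta^{Pl}_{r} = \frac{\sum_{i=1}^{I} \ln(\theta_{Pi}) \times 1[D_i^F = 1(l = F) \,\&\, X_{0i} = 1(r = C)]}{\sum_{i=1}^{I} 1[D_i^F = 1(l = F) \,\&\, X_{0i} = 1(r = C)]}, \quad l = M, F, \; r = C, N \]

From random assignment of individuals to wage contracts, and since

\[ E[\theta_{Pi} \mid X_{0i}, D_i^F] = E[\theta_{Pi}] \]

it follows that

\[ E[\omega^{PM}_{N} - \omega^{PM}_{C}] = E[(\eta^{PM}_{N} - \eta^{PM}_{C})\delta + \ln(\beta_{P0})\delta] = \delta \ln(\beta_{P0}) \]

and

\[ E[\omega^{PF}_{N} - \omega^{PF}_{C}] = E[(\eta^{PF}_{N} - \eta^{PF}_{C})\delta + (\ln(\beta_{P0}) + \ln(\beta_{P0}^F))\delta] = \delta (\ln(\beta_{P0}) + \ln(\beta_{P0}^F)) \]

Hence, \(\beta_{P0}\) and \(\beta_{P0}^F\) are identified and estimable via

\[ \hat\beta_{P0} = \exp\left[ \frac{\omega^{PM}_{N} - \omega^{PM}_{C}}{\delta} \right] \]

\[ \hat\beta_{P0}^F = \exp\left[ \frac{(\omega^{PF}_{N} - \omega^{PF}_{C}) - (\omega^{PM}_{N} - \omega^{PM}_{C})}{\delta} \right] \]

2.5 Accuracy

Accuracy measures whether or not unit \(q\) in period \(t\) was correctly entered by \(i\).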
For our purposes, higher accuracy directly translates to greater output quality.2 To fix ideas, we define a latent accuracy index \(a_{qi}^{*}\) as

\[ a_{qi}^{*} = \Theta_{Aqi} + \varepsilon_{qi} \tag{25} \]

where \(\varepsilon_{qi}\) follows a known symmetric distribution \(\Phi(\cdot)\). The accuracy parameter contains both permanent characteristics and treatment effects,

\[ \Theta_{Aqi} = \Theta_{Ai} + T_{qi} \]

where the fixed component is

\[ \Theta_{Ai} = \theta_{Ai} + \beta_{A0} X_{0i} + \beta_{A0}^F X_{0i} D_i^F \]

and the time-varying treatment component is

\[ T_{qi} = \beta_{A1} X_{1qi} + \beta_{A1}^F X_{1qi} D_i^F. \]

As before, \(X_{1qi}\) represents the treatment that was in effect when \(i\) produced her \(q\)th unit of output, and may vary across different units produced on different days. We do not observe the latent accuracy index \(a_{qi}^{*}\) directly. Instead, we observe a binary \(a_{qi} \in \{0, 1\}\) where

\[ a_{qi} = \begin{cases} 1 & \text{if } a_{qi}^{*} > 0 \\ 0 & \text{otherwise} \end{cases} \]

This implies

\[ \Pr[a_{qi} = 1 \mid \Theta_{Aqi}] = \Pr[a_{qi}^{*} > 0 \mid \Theta_{Aqi}] = \Pr[\varepsilon_{qi} > -\Theta_{Aqi}] = 1 - \Phi(-\Theta_{Aqi}) = \Phi(\Theta_{Aqi}). \]

As in the case of leisure preference and productivity, we will construct a differencing estimator. In this case, however, we need to make adjustments to cope with the non-linearity in parameters introduced by the function \(\Phi(\cdot)\). We begin with a difference equation for the treatment effects \(\beta_{A1}\) and \(\beta_{A1}^F\). We then derive the estimator for the selection effects \(\beta_{A0}\) and \(\beta_{A0}^F\), and finally we back out individual fixed effects \(\theta_{Ai}\), given known values for treatments and selection.

2 In this subsection, we outline a version of the accuracy parameter identification without a learning effect. For a version with a learning effect, see Appendix A. Introducing a learning effect makes the model considerably less computationally tractable.

2.5.1 Accuracy Treatment Effects

Note that for all of \(i\)'s units produced under the same task framing, the treatment effect is held constant.
Let \(i\)'s within-treatment accuracy score be defined as

\[ Y_{ji}^{A} = \begin{cases} \dfrac{\sum_{q_i=1}^{Q_i} 1[A_{qi} = 1 \,\&\, X_{1qi} = 1(j = C)]}{Q_{ji}} & \text{if } \sum_{q_i=1}^{Q_i} 1[X_{1qi} = 1(j = C)] > 0 \\ 0 & \text{otherwise} \end{cases} \qquad j = C, N, \]

where \(Q_{Ni} \equiv \sum_{q_i=1}^{Q_i} 1(X_{1qi} = 0)\) and \(Q_{Ci} \equiv \sum_{q_i=1}^{Q_i} 1(X_{1qi} = 1)\) are the totals of output units by \(i\) under task framing \(j = C, N\). Whenever both \(Q_{Ni} > 0\) and \(Q_{Ci} > 0\) we have

\[ \Phi^{-1}\left( \Pr[A_{qi} = 1 \mid X_{1qi} = 1] \right) = \Theta_{Ai} + \beta_{A1} + \beta_{A1}^F D_i^F \]

and

\[ \Phi^{-1}\left( \Pr[A_{qi} = 1 \mid X_{1qi} = 0] \right) = \Theta_{Ai}. \]

If we substitute the empirical analogs of the left-hand sides of the equations above and difference them we get

\[ E\left[ \Phi^{-1}\left( Y_{Ci}^{A} \right) + BC_{Ci} - \Phi^{-1}\left( Y_{Ni}^{A} \right) - BC_{Ni} \right] = \beta_{A1} + \beta_{A1}^F D_i^F \]

where the expectation is taken across individuals \(i\) and the \(BC_{ji}\), \(j = C, N\), are known bias correction terms (which we define below) to adjust for finite sample variability of the empirical accuracy scores \(Y_{ji}^{A}\). This establishes that the accuracy treatment effects are identified and estimable via a simple least squares routine:

\[ \begin{bmatrix} \hat\beta_{A1} \\ \hat\beta_{A1}^F \end{bmatrix} = \arg\min_{[\beta_{A1},\, \beta_{A1}^F]'} \left\{ \sum_{i=1}^{I} \left( \Phi^{-1}\left[ Y_{Ci}^{A} \right] + BC_{Ci} - \Phi^{-1}\left[ Y_{Ni}^{A} \right] - BC_{Ni} - \beta_{A1} - \beta_{A1}^F D_i^F \right)^2 1\left[ (Q_{Ci} > 0) \,\&\, (Q_{Ni} > 0) \right] \right\} \tag{26} \]

2.5.2 Accuracy Selection Effects

At this point we can consider the treatment effects as known objects and we now turn to selection, which will involve difference equations across individuals in the two hiring stage treatments. Let \(\omega^{Al}_{rji}\) denote the empirical accuracy score for a person of gender \(l\), hiring treatment group \(r\), on days where task framing treatment \(j\) is in effect. Specifically, this is defined as

\[ \omega^{Al}_{rji} \equiv \begin{cases} \dfrac{\sum_{q_i=1}^{Q_i} 1\left[ (A_{qi} = 1) \,\&\, (D_i^F = 1[l = F]) \,\&\, (X_{0i} = 1[r = C]) \,\&\, (X_{1qi} = 1[j = C]) \right]}{Q^{l}_{rji}} & \text{if } Q^{l}_{rji} > 0 \\ 0 & \text{otherwise} \end{cases} \qquad l = F, M; \; r = C, N; \; j = C, N, \]

where \(Q^{l}_{rji} \equiv \sum_{q_i=1}^{Q_i} 1\left[ (D_i^F = 1[l = F]) \,\&\, (X_{0i} = 1[r = C]) \,\&\, (X_{1qi} = 1[j = C]) \right]\) is the total number of units produced by person \(i\), given his/her gender and hiring treatment, under task framing treatment \(j\).
By similar logic to the above, it follows that

\[ E\left[ \Phi^{-1}[\omega^{Al}_{rji}] + BC^{l}_{rji} - 1[j = C]\left( \beta_{A1} + \beta_{A1}^F 1[l = F] \right) \right] = \theta_{Ai} + 1[r = C]\left( \beta_{A0} + \beta_{A0}^F 1[l = F] \right), \quad l = F, M; \; r = C, N; \; j = C, N, \tag{27} \]

where \(BC^{l}_{rji}\) is a bias correction term as above. Since gender \(l\) and hiring treatment \(r\) are fixed, the above expression gives us up to two equations for each individual \(i\). Note also that for each individual, the difference of these two equations leaves only the selection effect. However, before differencing we aggregate across all individuals to get

\[ \Omega^{Al}_{rj} \equiv \frac{\sum_{i=1}^{I} \left[ \Phi^{-1}[\omega^{Al}_{rji}] + BC^{l}_{rji} - 1[j = C]\left( \beta_{A1} + \beta_{A1}^F 1[l = F] \right) \right] 1\left[ Q^{l}_{rji} > 0 \right]}{\sum_{i=1}^{I} 1\left[ Q^{l}_{rji} > 0 \right]}. \]

Now, since \(E[\theta_A \mid X_0] = E[\theta_A]\) by definition because the selection effect is just the mean difference in permanent types, then by random assignment we have the following relationships:

\[ \begin{aligned} E\left[ \Omega^{AF}_{CC} - \Omega^{AF}_{NC} \right] &= \beta_{A0} + \beta_{A0}^F, \\ E\left[ \Omega^{AF}_{CN} - \Omega^{AF}_{NN} \right] &= \beta_{A0} + \beta_{A0}^F, \\ E\left[ \Omega^{AM}_{CC} - \Omega^{AM}_{NC} \right] &= \beta_{A0}, \text{ and} \\ E\left[ \Omega^{AM}_{CN} - \Omega^{AM}_{NN} \right] &= \beta_{A0}. \end{aligned} \tag{28} \]

These equations establish that the selection effects are identified and estimable via a simple least squares routine

\[ \begin{aligned} \hat\beta_{A0} &= \arg\min_{\beta_{A0}} \left\{ \left[ \Omega^{AM}_{CC} - \Omega^{AM}_{NC} - \beta_{A0} \right]^2 + \left[ \Omega^{AM}_{CN} - \Omega^{AM}_{NN} - \beta_{A0} \right]^2 \right\} \\ \hat\beta_{A0}^F &= \arg\min_{\beta_{A0}^F} \left\{ \left[ \left( \Omega^{AF}_{CC} - \Omega^{AF}_{NC} \right) - \left( \Omega^{AM}_{CC} - \Omega^{AM}_{NC} \right) - \beta_{A0}^F \right]^2 + \left[ \left( \Omega^{AF}_{CN} - \Omega^{AF}_{NN} \right) - \left( \Omega^{AM}_{CN} - \Omega^{AM}_{NN} \right) - \beta_{A0}^F \right]^2 \right\}. \end{aligned} \tag{29} \]

2.5.3 Individual Fixed Effects

Finally, with treatment and selection parameters known, it is easy to see why \(\theta_{Ai}\) is identified, since we observe individual \(i\) across multiple units of production. Moreover, we can estimate fixed effects for individual \(i\) with gender \(l\) and hiring treatment \(r\) using a simple weighted average of the two task-framing equations in (27), where the treatment terms enter only under CSR framing (\(j = C\)):

\[ \hat\theta_{Ai} = \frac{Q^{l}_{rCi}}{Q^{l}_{rCi} + Q^{l}_{rNi}} \left( \Phi^{-1}\left[ \omega^{Al}_{rCi} \right] + BC^{l}_{rCi} - 1[r = C]\left( \hat\beta_{A0} + \hat\beta_{A0}^F 1[l = F] \right) - \hat\beta_{A1} - \hat\beta_{A1}^F 1[l = F] \right) + \frac{Q^{l}_{rNi}}{Q^{l}_{rCi} + Q^{l}_{rNi}} \left( \Phi^{-1}\left[ \omega^{Al}_{rNi} \right] + BC^{l}_{rNi} - 1[r = C]\left( \hat\beta_{A0} + \hat\beta_{A0}^F 1[l = F] \right) \right), \quad i = 1, \dots, I. \tag{30} \]

2.5.4 Bias Correction

The source of bias from finite sampling above comes from the fact that

\[ \Phi^{-1}\left( E\left[ \omega^{Al}_{rji} \right] \right) \ne E\left[ \Phi^{-1}\left( \omega^{Al}_{rji} \right) \right] \]

due to the non-linearity in the quantile function \(\Phi^{-1}\). In particular, under a probit model (standard normal shocks) there will be a bias toward the median for finite \(Q_i\) since the normal CDF is convex below the median and concave above it. However, one can use the known functional form of \(\Phi\) and the sampling distribution of \(\omega^{Al}_{rji}\) to correct for the bias. Note that the empirical accuracy scores are sample means from Bernoulli random variables, meaning that the central limit theorem provides us with a sampling distribution theory. Specifically, if \(\{w_n\}_{n=1}^{N}\) is an iid sample of Bernoulli observations with mean \(\rho\), then the sample mean \(\hat\rho = \bar{W} = \sum_{n=1}^{N} w_n / N\) is (approximately) normal with mean \(\rho\) and standard deviation \(\sqrt{\rho(1 - \rho)/N}\). Using this information, one can compute the expectation

\[ E\left[ \Phi^{-1}(\hat\rho) \right] = \int_0^1 \Phi^{-1}(t; 0, 1)\, \varphi\left( t; \hat\rho, \sqrt{\frac{\rho(1 - \rho)}{N}} \right) dt, \]

where \(\Phi(\cdot; \mu, \sigma)\) and \(\varphi(\cdot; \mu, \sigma)\) are the normal CDF and PDF, respectively.

2.6 Estimation

Our estimation strategy is to specify the functional form of the cost function \(c(\cdot)\) and the distributions \(F_j\) as B-spline functions. B-splines are convenient tools for parametric curve fitting since they can be constructed to fit any continuous curve to arbitrary precision (see Hickman et al. 2016). But before we proceed, we need to clarify some technical hurdles concerning the estimation method and the distributions \(F_j\). In order to define a B-spline basis function, it is necessary to start with a fixed knot vector of \(K_j + 1\) points spanning the relevant compact domain for \(F_j\), \(j \in \{P, A, L\}\). But this will be problematic since we do not know the upper and lower bounds of \(\theta_j\). To circumvent this problem, we will instead parametrize and estimate the quantile functions \(Q_j\) associated with each \(F_j\). This is possible since we know that the compact domain of \(Q_j\) is \([0, 1]\).
Our B-spline basis function for \(Q_j\) is defined as follows:

\[ Q_j(r; \alpha_j) = F_j^{-1}(r; \alpha_j) = \sum_{k=1}^{K_j + 3} \alpha_{jk} B_{jk}(r), \quad j \in \{L, P, A\} \]

where \(B_{jk} : [0, 1] \to \mathbb{R}\) is the B-spline basis function determined by the knot vector \(k_j\), \(j \in \{L, P, A\}\). The cost function is parametrized in a similar fashion:

\[ c(h; \alpha_c) = \sum_{k=1}^{K_c + 3} \alpha_{ck} B_{ck}(h) \]

where \(B_{ck} : [0, \max(h_{it})] \to \mathbb{R}\). But in order for these functions to be valid quantile and cost functions, we need to impose a number of shape restrictions. In particular, for each \(j \in \{P, A, L\}\), we impose the following condition:

\[ Q_j'(r; \alpha_j) > 0, \quad \forall r \in (0, 1) \]

and for the cost function

\[ c'(h; \alpha_c) > 0, \quad \forall h \in (0, 1) \]
\[ c''(h; \alpha_c) > 0, \quad \forall h \in (0, 1) \]
\[ c(0; \alpha_c) = 0 \]

In other words, both the quantile function and the cost function are monotone. By construction, a B-spline function is monotone if and only if its parameters are ordered monotonically. This implies:

\[ \alpha_{jk} \le \alpha_{j(k+1)}, \quad \forall k = 1, 2, \dots, K_j - 2, \quad j \in \{L, P, A\} \]
\[ \alpha_{ck} \le \alpha_{c(k+1)}, \quad \forall k = 1, 2, \dots, K_c - 2 \]

The above conditions also imply convexity of the cost function. To see this, note that convexity of the B-spline function is imposed by

\[ \alpha^{2}_{ck} \le \alpha^{2}_{c(k+1)}, \quad \forall k = 1, 2, \dots, K_c - 2 \]

and the boundary condition is imposed by

\[ \alpha_{c1} = 0 \]

Finally, note that any B-spline function

\[ Q(x; \alpha) = \sum_{k=1}^{K} \alpha_k B_k(x) \]

can be expressed as

\[ Q(x; \alpha) = B(x)\alpha \]

where the \(k\)th column of \(B(x)\) is \(B_k(x)\). Similarly,

\[ Q'(x; \alpha) = B'(x)\alpha \]

where the \(k\)th column of \(B'(x)\) is \(B_k'(x)\). In practice, the above conditions are imposed as constraints on a non-linear solver.

The next step is to obtain the moment conditions necessary for pinning down the quantile function parameters. First note that for an ordered sample of observations \(\{x_{(1)} \le x_{(2)} \le \dots \le x_{(I)}\}\) from a distribution \(F_X(x)\) with density \(f_X(x)\) and quantile function \(Q_X(r)\), \(r \in (0, 1)\), the density of \(x_{(i)}\) is given by

\[ f_{X(i)}(x_{(i)}) = I \binom{I - 1}{i - 1} F_X(x_{(i)})^{i-1} \left[ 1 - F_X(x_{(i)}) \right]^{I-i} f_X(x_{(i)}) \]

Now, suppose we have a random sample \(\{x_i\}_{i=1}^{I}\) generated from some arbitrary distribution \(F_X(x)\). We could generate a corresponding set of quantile ranks \(\{r_i\}_{i=1}^{I}\) through the transformation \(r_i = F_X(x_i)\) or \(x_i = Q_X(r_i)\), and note that \(R \sim U(0, 1)\). If we order the observations, we get a set of order statistics \(\{x_{(1)} \le x_{(2)} \le \dots \le x_{(I)}\}\) and \(\{r_{(1)} \le r_{(2)} \le \dots \le r_{(I)}\}\) with \(x_{(i)} = Q_X(r_{(i)})\) by monotonicity of \(Q_X(\cdot)\). From above, we know that

\[ E[r_{(i)}] = \frac{i}{I + 1} \]

and the density of \(R_{(i)}\) is

\[ f_{R(i)}(t) = I \binom{I - 1}{i - 1} t^{i-1} (1 - t)^{I-i} \]

Therefore, we can conclude that

\[ E[X_{(i)}] = \int_0^1 Q_X(t) f_{R(i)}(t)\, dt = I \binom{I - 1}{i - 1} \int_0^1 Q_X(t)\, t^{i-1} (1 - t)^{I-i}\, dt, \quad i = 1, 2, \dots, I \]

This fact gives us a basis for moment conditions to pin down \(\alpha_P\), \(\alpha_A\), and \(\alpha_L\). Namely, for an ordered sample \(\theta_{j(1)} \le \theta_{j(2)} \le \dots \le \theta_{j(I)}\) we have

\[ E[\theta_{j(i)}] = I \binom{I - 1}{i - 1} \int_0^1 Q_j(t, \alpha_j)\, t^{i-1} (1 - t)^{I-i}\, dt, \quad \forall i = 1, 2, \dots, I, \quad j \in \{P, A, L\} \tag{31} \]

A final hurdle comes from the fact that we need a way of computing an empirical analog to the left-hand side of (31) that aggregates information from all observations to get a consistent estimator of \(E[\theta_{j(i)}]\). It turns out that it is possible to analytically compute the asymptotic value of

\[ \hat{E}\left[ \theta^{j}_{(i)} \right] = \frac{1}{S} \sum_{s=1}^{S} \tilde\theta^{j}_{s(i)} \]

given a sample of simulated order statistics \(\{\tilde\theta^{j}_{s(1)}, \tilde\theta^{j}_{s(2)}, \dots, \tilde\theta^{j}_{s(I)}\}\) generated by \(I\) draws from \(\{\theta^{j}_{1}, \theta^{j}_{2}, \dots, \theta^{j}_{I}\}\). This is accomplished by computing \(P_{km} = \Pr(\tilde\theta^{j}_{s(m)} = \theta^{j}_{(k)})\) for a given simulated sample \(s\) so that

\[ \lim_{S \to \infty} \hat{E}\left[ \theta^{j}_{(m)} \right] = \sum_{k=1}^{I} P_{km}\, \theta_{j(k)} \]

The method for doing this is outlined in Appendix B.
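As a numerical illustration of the right-hand side of (31), the sketch below evaluates the expected \(i\)th order statistic for an arbitrary quantile function by the midpoint rule. The function name `order_stat_mean`, the grid size, and the use of a generic callable in place of the B-spline \(Q_j(\cdot; \alpha_j)\) are assumptions made here for illustration; the simulation-based left-hand side and the weights \(P_{km}\) of Appendix B are not covered.

```python
import math


def order_stat_mean(Q, i, I, n_grid=20000):
    """Approximate E[X_(i)] = I * C(I-1, i-1) * int_0^1 Q(t) t^(i-1) (1-t)^(I-i) dt.

    Q is any quantile function on (0, 1); the integral is evaluated with the
    midpoint rule on n_grid equal cells, so Q is never called at 0 or 1.
    """
    const = I * math.comb(I - 1, i - 1)
    total = 0.0
    for k in range(n_grid):
        t = (k + 0.5) / n_grid
        total += Q(t) * t ** (i - 1) * (1.0 - t) ** (I - i)
    return const * total / n_grid


# Uniform case: Q(t) = t, so the result should be close to i / (I + 1).
uniform_mean = order_stat_mean(lambda t: t, 3, 10)

# Unit exponential case: Q(t) = -ln(1 - t); the sample minimum of I draws
# has mean 1 / I.
exponential_min = order_stat_mean(lambda t: -math.log(1.0 - t), 1, 5)
```

In an estimation routine of the kind described above, one would presumably replace the callable with the spline evaluated at candidate coefficients and match these values to the empirical analogs for each \(i\).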
3 Experimental Design

We design a natural field experiment (Harrison and List 2004) to generate the data necessary for estimating the model. The experiment uses randomized treatments administered in two stages: (1) a hiring stage and (2) a real effort work stage. While the primary purpose of the hiring stage is to investigate the effect of CSR on the extensive margin, the focus of the work stage is the intensive margin. The two stages are, however, linked in the sense that any subject who participates in the work stage was also observed in the hiring stage. This approach allows us to investigate whether work stage productivity depends on hiring stage CSR selection. In the first stage, we hired subjects while varying wage and non-pecuniary treatment conditions. The second stage real effort task had subjects working on a simple data entry task through an online task system. The task system allows us to obtain detailed measures of output and labor input during each day of the experiment. Working subjects are assigned to treatments on a daily basis, which gives us both within and between subject variation.

The experiment was designed with several important conditions in mind. First, we need randomized assignment to treatments in both stages. This is necessary to identify the model, as the model implies that potential employees need to be offered different wage rates and given varying degrees of information about CSR. Second, we need to observe the same test subjects in both stages. (More precisely, at least a subset of the subjects who participated in the application stage needs to be observed in the work task stage.) This is crucial for separating treatment and selection effects. Third, we need to achieve as natural an environment as possible for the subjects to review the job application and to work. Fourth, the work task needs to be uniform across treatments so as to allow for identification of the cost function.
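The two-stage structure can be sketched in a few lines: each subject receives one hiring-stage cell for the whole experiment, and a fresh work-stage framing every day. The sketch uses the two wage levels and the ten-day horizon described in Sections 3.1 and 3.2. Everything else, including the function names, the equal cell sizes, and the assignment probability of one half for the daily framing, is an assumption made here for illustration and is not taken from the experimental protocol.

```python
import random

WAGES = (11, 15)   # hiring-stage wage offers in dollars
N_DAYS = 10        # length of the work stage in days


def assign_hiring_cells(subject_ids, rng):
    """Assign each subject once to a (wage, X0) cell.

    Subjects are shuffled and then dealt into the four wage-by-CSR cells in
    turn, which keeps cell sizes as equal as possible (an assumed scheme).
    """
    cells = [(w, x0) for w in WAGES for x0 in (0, 1)]
    ids = list(subject_ids)
    rng.shuffle(ids)
    return {sid: cells[k % len(cells)] for k, sid in enumerate(ids)}


def assign_daily_framing(subject_ids, rng, n_days=N_DAYS):
    """Draw the work-stage dummy X1 for every subject and day.

    Each draw is an independent coin flip (assumed probability 1/2), which
    produces both within-subject and between-subject variation.
    """
    return {sid: [rng.randint(0, 1) for _ in range(n_days)] for sid in subject_ids}


rng = random.Random(2016)                  # fixed seed for reproducibility
hiring = assign_hiring_cells(range(8), rng)
daily = assign_daily_framing(hiring, rng)  # iterates over the subject ids
```

Keeping the hiring-stage cell fixed while re-drawing the daily framing is what allows selection effects (between hiring cells) to be separated from treatment effects (within subject, across days).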
In order to satisfy all these conditions, we started an actual consultancy firm. The firm, HHL Solutions, LLC, specializes in data collection and data management services for various clients. A website for the firm was also designed and published online. (See Appendix C for a screenshot from the firm website. The website did not play a direct part in the experiment but was set up in case potential subjects searched for the firm on the web.)

Before we proceed, we discuss the necessity of actually running our own firm rather than partnering with an existing firm or using a crowdsourcing market such as mTurk. Since we want to investigate how information about wages and non-pecuniary factors affects decisions at the extensive margin, we need to hire a relatively large number of individuals for a short term job. This imposes a restriction on the set of firms with which we can partner. The empirical estimation procedure requires detailed measurements of labor supply, output quantity, and output quality. This, in turn, necessitates the work task to be relatively simple and uniform across workers, further limiting the possibility of collaboration with an actual firm.

The reason for starting a firm rather than using an internet crowdsourcing marketplace is twofold. First, the subject pool in a crowdsourcing market is likely to exhibit certain characteristics that would be undesirable for our purposes. For example, subjects recruited via mTurk have a tendency to pay less attention to experimental materials and have a different attitude towards money than subjects recruited from other populations (see Chandler et al. 2014; Goodman et al. 2013; Paolacci and Chandler 2014). This would be problematic in our case since the attitude towards pecuniary and non-pecuniary incentives is central to our research question, and our statistical power depends on subjects reading through the materials.
Second, the manner in which we assign subjects to treatments in the second stage, and the level of detail in the output measures, require a large degree of flexibility in the online task system. This would make it quite difficult to operationalize the experiment on a crowdsourcing market. We now turn to a more detailed discussion of each experimental stage.

3.1 Hiring Stage

Employment advertisements were placed in 12 cities throughout the US on Craigslist.com. The particular cities were chosen based on size, geography, and the relative activity of the associated Craigslist page. The ad described a short term employment opportunity in general terms and provided some basic information about the employer. It was emphasized that the job consisted of a simple data entry task that requires no specific previous experience and that the position is temporary, lasting only a few weeks. The ad also stated that the wage would be somewhere in the range of $11-$15. Interested individuals were encouraged to request more information by replying to the ad via email.

Individuals who responded to the ad were subsequently randomized into treatment groups. These subjects were sent an additional email containing a wage offer and a more detailed description of the firm and the job. The first stage treatments were administered via this email. Importantly, all subjects, regardless of treatment, were exposed to the same original job advertisement. This permits us to randomize subjects to treatments in a controlled manner. In contrast, if we were to use more than one version of the ad, there would be no way of controlling which ad each subject was exposed to and whether or not a given subject was exposed to more than one ad.3 The separation between the initial response and the implementation of the treatment also allowed us to balance the treatment cells based on gender. While we are not allowed to explicitly ask an individual about their gender, we can predict the gender based on the name.
In particular, we used probabilities derived from the Social Security Administration database of name popularity. The timing of the first stage of the experiment is summarized in Figure 1.

3 This element of the design is similar to the one used by Flory et al. (2014) and Leibbrandt and List (2014) to study gender differences in attitudes towards competition and wage negotiations. A similar "two-step" randomization procedure was used by Ashraf et al. (2010) to separate screening and sunk-cost effects in a field experiment on how to charge for health care products in developing countries.

Figure 1: Hiring stage timing

We used two different classes of treatments in the first stage: (1) two different wage levels, $11 and $15, and (2) two different texts containing different amounts of information about the firm's involvement in CSR. In terms of the notation from the model, the wage rate for subject \(i\) is represented by \(w_i \in \{\$11, \$15\}\). The CSR information is represented by the dummy variable \(X_{0i}\) defined by

\[ X_{0i} = \begin{cases} 1 & \text{if subject } i \text{ was informed about CSR in the job description} \\ 0 & \text{otherwise} \end{cases} \]

for \(i = 1, 2, \dots, I\). The wage and CSR treatments were crossed with each other, resulting in 4 treatment cells. The treatment cells are summarized in Table 1.

              X_0i = 0        X_0i = 1
w_i = $11     11 / Neutral    11 / CSR
w_i = $15     15 / Neutral    15 / CSR

Table 1: Hiring stage treatments

The "Neutral" and "CSR" treatments warrant some further clarification. In the two treatment cells with \(X_{0i} = 0\), the task information email contained the following information:

"The position is focused on data entry tasks for a number of our clients. We provide services for a variety of different firms and organizations."

The rest of the email specified the wage offer and outlined the task in more detail. In the two "CSR" treatment cells with \(X_{0i} = 1\), there was more information about the firm's CSR endeavors:

"The position is focused on data entry tasks for a number of our clients. We provide services for a variety of different firms and organizations. Some of them work in the nonprofit sector with various charitable causes. For example with projects aimed at improving access to education for underprivileged children. We believe that these organizations are making the world a better place and we want to help them doing so. Due to the charitable nature of their activity, we only charge these clients at cost."

In order to avoid deception, we made sure to actually have at least one client of each type: one for-profit firm and one non-profit organization. These particular clients were Uber Technologies, Inc. and school districts, such as Chicago Heights. The remaining part of the email was identical to that in the Neutral treatment cells. The email ended by asking subjects to formally apply for the position by submitting their resume, if they were still interested after having been offered a wage rate and given more information about the task.

After viewing the additional information provided in either treatment group, the subject makes the decision whether or not to apply for the job. This provides us with our first outcome variable: the application decision. By computing the share of the initial pool of subjects (the ones that responded to the original ad) that actually apply, we can estimate how the wage rate (pecuniary incentive) and the information about CSR (non-pecuniary incentive) affect the probability of applying. The application decision is an important outcome variable as it constitutes the extensive margin effect of our treatments. But the two treatment groups used in the first stage also give us two distinct groups of individuals entering the second stage of the experiment: those who applied after having been exposed to either treatment. In terms of the model, we obtain a vector \(X_0 = (X_{01}, X_{02}, \dots, X_{0I})\) where each element contains the \(X_{0i}\) dummy for each agent.
Since there is a selection effect on \((\theta_{Pi}, \theta_{Ai}, \theta_{Li})\), across-type differences will be important in the analysis of the treatment effects in the work stage.

3.2 Work Stage

3.2.1 The Task

A subset of those who applied for the job in the hiring stage was hired to participate in the work stage. Subjects were informed that they would be working for 10 days and that they were free to work any number of hours during the project period. In terms of the model, one day corresponds to a "period" \(t\) so that \(t = 1, 2, \dots, T\) in the experiment. Each subject worked from home through our custom-designed online task system. They were each provided with a username and a password for logging into the system at any time. The system allows us to track the number of hours worked by each subject each day, the quantity of output produced, and the quality of output.

Upon logging into the system, subjects saw a dialog box explaining the task system itself and some information about the client associated with the work for the particular day. After reading this information, subjects were taken to the "main screen". Here, subjects were faced with a list of tasks that could be completed in any order. A task was initiated by clicking on the associated link, which took the subject to a data entry screen. On the data entry screen, the subject was presented with a snapshot from Google Streetview depicting a street in a major city in the US. The snapshot was sampled (without replacement) from a large database of pictures collected prior to the experiment. Below the snapshot was a web form with a number of text boxes and drop-down menus for entering data. The subject's task was to complete the form using information contained in the snapshot. This involved inspecting the picture and correctly entering the relevant information for each variable listed. In total, there were 6 meta-data variables about the picture itself and 12 variables related to the street visible in the picture.
Some of the variables required the subject to make an assessment of the quality of the street or the buildings visible in the picture and assign a rating on a Likert scale. Other variables required the subject to count the number of occurrences of certain objects in the picture (e.g. potholes and broken windows). Once all the information was recorded, the subject submitted the task by clicking on a button, which took him or her back to the main screen. The completed tasks were then marked as "completed" in a column in the middle of the main screen. Time was automatically tracked provided the subject was logged into the system. The subject could view the amount of active working time (recorded in minutes) on the main screen at any time. Subjects who logged out were once again presented with the task description when logging back in. The task was chosen so both our private clients and our non-profit clients could gain valuable information. In this way, we can convey work done for each client without deceiving workers.

3.2.2 Work Stage Treatments

Since \(w_i\) does not depend on \(t\) in the model, a given subject faced the same wage rate every day for the duration of the experiment. The information about the nature of the "client" associated with the task of the particular day was, however, varied across treatments. Information about the client was presented to subjects via the information text displayed when logging into the system and in information boxes on the main screen. There were two different types of clients: "for-profit" clients not associated with CSR and "non-profit" clients associated with CSR. It was emphasized that the latter type of client works with improving access to education for underprivileged children and that our firm wants to help these clients make the world a better place. As a consequence, our firm only charges these clients at cost. This was our instrument for CSR in the work stage.
(See Appendix C for sample screenshots from the task and main screens.) Subjects were randomized into a treatment at the beginning of each day (at 12:00 am local time). This implies both within-subject and between-subject variation in the second stage. In terms of the model variables, the work stage treatments are represented by the dummy variable X_{1it}, defined as

\[
X_{1it} =
\begin{cases}
1, & \text{if subject } i \text{ is associated with a CSR client during day } t \\
0, & \text{otherwise}
\end{cases}
\]

for i = 1, 2, ..., I and t = 1, 2, ..., 10.

3.2.3 Work Stage Outcome Variables

There are three principal outcome variables in the second stage: (1) number of hours worked, (2) quantity of output, and (3) quality of output. Quality of output is defined as the share of items correctly submitted. The number of hours worked, h_{it}, is simply the number of hours subject i spent logged into the system during day t. It is important to note that each subject is allowed to decide for himself or herself how many hours to work each day. Our time tracking system has a built-in mechanism so that a subject is automatically logged out (and paid time halted) if he or she stays inactive for longer than a specified interval of time. This prevents subjects from “gaming the system” by logging in and earning money without actually working, while still allowing for some idle time. The automatic log-out mechanism also implies that we do not need to distinguish time spent on the data entry screen from time spent on the main screen.

We use two different measures of productivity. First, the familiar definition of productivity, output per unit of time q_{it}/h_{it}, will primarily be of interest in the reduced form analysis. Second, we obtain a measure τ_{qit} by computing the accumulated amount of time it takes for a given subject to produce up to the q-th unit of output. This measure will be used to identify the productivity parameters of the model.
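The outcome variables defined above can be illustrated with a minimal sketch. It assumes a hypothetical log of per-task completion times in minutes for one subject-day; the function names, the `"non-profit"` label, and the sample numbers are illustrative assumptions and do not come from the experiment's task system.

```python
from itertools import accumulate


def csr_dummy(client_type):
    """X_1it: 1 if the day's client is a CSR (non-profit) client, else 0.

    The string label "non-profit" is a hypothetical encoding.
    """
    return 1 if client_type == "non-profit" else 0


def output_per_hour(units, hours):
    """Reduced-form productivity q_it / h_it for one subject-day."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return units / hours


def cumulative_minutes(task_minutes):
    """tau_q: accumulated working time up to and including the q-th unit."""
    return list(accumulate(task_minutes))


# Hypothetical subject-day: minutes spent on each of four completed tasks.
task_minutes = [2.0, 1.5, 2.5, 3.0]
tau = cumulative_minutes(task_minutes)   # time elapsed at each unit of output
hours = tau[-1] / 60                     # total logged time in hours
rate = output_per_hour(len(task_minutes), hours)
x1 = csr_dummy("non-profit")
```

The cumulative series `tau` carries more information than the daily ratio `rate`, since it records how quickly each successive unit was produced within the day.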
Finally, we obtain our measure of accuracy, a_{qit}, as an indicator of whether unit q was correctly entered by subject i in period t. A unit is correctly entered if what the subject reported as being in the Streetview snapshot is actually true. (This is verified ex post.) For Likert scale variables, we rely on the “wisdom of the crowds” and the fact that each individual picture was viewed by more than one subject. If a subject’s response for a given item and picture was more than two points away from the mean response of the individuals who also worked on that same picture, the response was coded as incorrect.

4 Reduced Form Results

This section presents the reduced-form results from the experiment. We will begin by discussing the results from the hiring stage. The primary question of interest will be how different wage rates and non-pecuniary incentives provided by information about CSR affect labor supply at the extensive margin. In other words, how does information about investments in CSR affect the probability that a given individual applies for the job? We then present the results from the work task stage of the experiment. Here, we are interested in how wages and information about CSR affect labor supply and output, and whether the results differ across hiring-stage treatment groups. This will allow us to address the question of whether, for example, subjects who were informed about CSR in the hiring stage of the experiment behave differently in the work stage.

4.1 Hiring Stage: Extensive Margin

The recruitment stage of the experiment was conducted during the summer of 2016 (see footnote 4). The cities included in the experiment were Austin, Baltimore, Boston, Dallas, Houston, Indianapolis, Jacksonville, Los Angeles, New York City, Philadelphia, Phoenix, and San Diego. A total of N=1103 job offers were sent out via email to individuals who had expressed interest in the job.
These individuals were randomly assigned to the four treatment cells: (1) $11 wage rate and no information about CSR (n=266), (2) $11 wage rate and information about CSR (n=309), (3) $15 wage rate and no information about CSR (n=277), and (4) $15 wage rate and information about CSR (n=251). Figure 2 shows the mean application rates, i.e. the share of subjects proceeding to apply for the job, in each treatment cell across the whole sample. Including information about CSR in the job description increases the application rate from 0.301 to 0.382 for subjects offered an $11 wage rate (Mann-Whitney U test: p=0.021). The corresponding change for subjects with a $15 wage rate offer is from 0.408 to 0.494 (MWU: p=0.024). The effect of the wage is only slightly larger. Upon changing the wage rate from $11 to $15, application rates among subjects exposed to the neutral information increase from 0.301 to 0.408 (MWU: p=0.009). For subjects with information about CSR, the corresponding increase is from 0.382 to 0.494 (MWU: p=0.008). There is no statistically significant difference in application rates between subjects with an $11 wage combined with information about CSR (0.382) and those with a $15 wage and no information about CSR (0.408).

Figure 2: Application rates across treatments: all subjects. Share applying for the job: $11/Neutral 0.301 (n=266); $11/CSR 0.382 (n=309); $15/Neutral 0.408 (n=277); $15/CSR 0.494 (n=251). (Error bars show 5% confidence intervals.)

Footnote 4: A smaller-scale pilot experiment was conducted in April 2016.

We conclude that information about CSR does indeed increase the probability that a given individual applies for the job. The effect size is about 8.5 percentage points. In comparison, the increase in application rates due to a change in the wage rate from $11 to $15 is roughly 11.5 percentage points.
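The percentage-point effects above can be re-derived from the four cell means. The sketch below is only an arithmetic check, not the authors' procedure: it takes unweighted averages of the two within-wage and two within-information differences (an assumption), which land close to the rounded magnitudes in the text. It also expresses the CSR effect as a share of the wage effect and scales it by the 36 percent wage change from $11 to $15, which is one hedged way to arrive at a wage equivalent.

```python
# Application rates by (wage, information) cell, as reported for Figure 2.
rates = {
    (11, "neutral"): 0.301,
    (11, "csr"): 0.382,
    (15, "neutral"): 0.408,
    (15, "csr"): 0.494,
}

# CSR effect at each wage level, then the unweighted average.
csr_effects = [rates[(w, "csr")] - rates[(w, "neutral")] for w in (11, 15)]
csr_effect = sum(csr_effects) / len(csr_effects)

# Wage effect ($11 -> $15) within each information condition, then the average.
wage_effects = [rates[(15, info)] - rates[(11, info)] for info in ("neutral", "csr")]
wage_effect = sum(wage_effects) / len(wage_effects)

# Wage change in relative terms and the implied wage equivalent of CSR,
# assuming application rates respond linearly to the wage over this range.
wage_change = (15 - 11) / 11
wage_equivalent = (csr_effect / wage_effect) * wage_change

print(f"CSR effect:      {100 * csr_effect:.2f} percentage points")
print(f"Wage effect:     {100 * wage_effect:.2f} percentage points")
print(f"Wage equivalent: {100 * wage_equivalent:.1f} percent")
```

The linearity assumption is the weak point of this calculation; with only two wage levels in the design, the shape of the response between $11 and $15 is not observed.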
This implies that revealing information about CSR induces roughly the same response in application rates as an increase in the wage rate of about 30 percent. Robustness checks indicate no statistically significant difference in these results across cities. It is worth noting that the observed application rates and wage elasticities are roughly similar to those found in comparable studies (see for example Flory et al. 2014; Leibbrandt and List 2014).

It is possible that men and women respond differently to wage and CSR incentives. To account for this possibility, we split the sample into male and female subsamples. Figure 3 shows the application rates across treatments for women only. The first thing to note is that both the wage rate and the CSR information have a smaller effect in the female subsample than in the full sample. Increasing the wage rate from $11 to $15 increases the application rate by about 9 percentage points without information about CSR (MWU: p=0.057) and by about 10 percentage points in the CSR treatments (MWU: p=0.023). The CSR information increases application rates by roughly 5.5 percentage points in the $11 treatments (MWU: p=0.124) and by 7.8 percentage points in the $15 treatments (MWU: p=0.069).

Figure 3: Application rates across treatments: women. Share applying for the job: $11/Neutral 0.337 (n=181); $11/CSR 0.393 (n=224); $15/Neutral 0.416 (n=197); $15/CSR 0.494 (n=168). (Error bars show 5% confidence intervals.)

Figure 4 shows the corresponding application rates for the male subsample. It is clear that men show a greater response to both wage and CSR incentives. The effect of an increase in the wage from $11 to $15 ranges from 14 percentage points in the CSR treatments (MWU: p=0.032) to about 18 percentage points in the neutral treatment (MWU: p=0.007). Men also show a relatively large response to the CSR incentive; when advertising CSR, the application rate for men increases by 14.3 percentage points for the $11 wage (MWU: p=0.020) and by 10.8 percentage points for the $15 wage (MWU: p=0.087).

Figure 4: Application rates across treatments: men. Share applying for the job: $11/Neutral 0.214 (n=84); $11/CSR 0.357 (n=84); $15/Neutral 0.392 (n=79); $15/CSR 0.500 (n=80). (Error bars show 5% confidence intervals.)

To summarize the results at the extensive margin, we conclude that an increase in the wage rate from $11 to $15 increases application rates by about 33 percent. Interestingly enough, the effect of revealing information about CSR is almost as large: around 25-26 percent. When comparing the application patterns between men and women, we find that, on average, men are more price sensitive than women. This is consistent with results in the literature on charitable fundraising: men are more likely to give to charity relative to women when the price of giving decreases (List 2011). We also find that men are more elastic than women with regard to the CSR incentive. This finding runs contrary to studies indicating that women are more responsive to social cues than men (Croson and Gneezy 2009). One possible explanation for this result is that data entry jobs are predominantly performed by women (our sample contains a much larger number of women than men), and the men who select into the pool of possible applicants differ in their attitudes towards social cues such as CSR relative to the male population as a whole.

Note that so far, we have not said anything about selection effects. Our results indicate that in certain cases, firms can increase the pool of applicants by roughly the same magnitude as the increase associated with a 30 percent wage increase by simply revealing information about CSR. Apart from the obvious fact that a firm might not gain from undertaking costly CSR investments just to expand the pool of applicants, there are more subtle ways in which non-pecuniary incentives like CSR information can backfire.
In particular, the wage and CSR incentives may not attract the same types of employees; those attracted by a higher wage may differ from those attracted by CSR along important dimensions such as productivity, accuracy, and leisure preference. The results from the work task stage of the experiment will shed light on whether this is the case.

Figure 5: Mean hours worked by wage rate and task type in pooled data. Panels: (a) mean hours worked by wage; (b) mean hours by wage and task type (non-CSR tasks, CSR tasks). (Pooling men and women of both CSR and non-CSR types.) Error bars show 5% confidence intervals.

4.2 Work Stage: Intensive Margin

There are two different types of subjects entering the work task stage of the experiment: (1) “CSR types” who selected into the job after having been exposed to CSR information in the hiring stage and (2) “non-CSR types” who selected into the job without information about CSR. Subjects were randomly sampled from each of the four hiring stage treatment cells so as to achieve a balanced sample of types and genders in the work stage. A total of 170 subjects participated in the work task stage of the experiment. Each subject was observed during a period of 10 days, which implies 1700 individual-day observations. Some subjects, however, did not supply a positive amount of hours every day; the mean number of days worked during the experiment across all subjects was 5.31. The mean daily number of hours worked, conditional on logging into the system, was 2.779 (SD = 3.086). On average, subjects produced 77.94 (SD = 87.16) units of output each day. Table 2 presents summary statistics for the work stage of the experiment.
                                     Mean     Median   SD       Min     Max
Days worked                          5.311    5        3.158    1       10
Hours worked per day                 1.481    0.095    2.685    0       16.279
Hours worked per day | h_{it} > 0    2.788    1.608    3.151    0.001   16.279
Daily earnings ($)                   37.956   21.990   43.896   0.012   179.069
Output                               77.744   46.000   87.085   1       613
Accuracy                             0.797    0.856    0.191    0.033   1

Table 2: Summary statistics: work stage. Total number of subjects: N = 170.

Figure 5 shows the effect of wage and CSR incentives in the full sample. (Here we pool men and women of both CSR and non-CSR types.) As expected, subjects who are paid a higher wage also supply more labor; when increasing the wage rate from $11 to $15, the mean number of hours worked increases from 2.475 to 2.962 (MWU: p=0.006). This implies an increase of roughly 30 minutes per day, or a 20 percent increase in labor supply. Panel (b) of figure 5 splits each wage group into CSR and non-CSR tasks. Subjects appear to increase labor supply when working with CSR tasks. The effect of CSR is also similar in magnitude across wage levels; subjects work on average about 15-20 minutes longer with CSR tasks than with non-CSR tasks. The increase is, however, not found to be statistically significant in the pooled sample (MWU: p=0.130 for $11 and MWU: p=0.165 for $15).

We now proceed to examine the effect of wage and CSR treatments separately for CSR and non-CSR type men and women. The patterns observed in figure 6 suggest that the effect of the CSR tasks varies across these four sub-groups. In particular, while there does not appear to be any significant effect for men, CSR type women exhibit quite a substantial increase in labor supply when working with CSR tasks. For the female subjects with an $11 wage, the mean number of hours worked increases from 1.86 for non-CSR tasks to 2.69 for CSR tasks (MWU: p=0.021), i.e. an increase of roughly 50 minutes per day. The corresponding increase for the female CSR type subjects with a $15 wage rate is even greater: from 2.266 to 3.326 hours per day (MWU: p=0.019).
This implies an increase of around 63 minutes per day. Given the relatively small sample of men, we are not able to draw any definitive conclusions about the impact of CSR tasks on male labor supply.

Figure 6: Labor supply by sub-group. Panels: (a) neutral in hiring stage, women; (b) CSR in hiring stage, women; (c) neutral in hiring stage, men; (d) CSR in hiring stage, men. Each panel shows hours for non-CSR tasks and CSR tasks at the $11 and $15 wage. Error bars show 5% confidence intervals.

We now turn to the effect on productivity. Figure 7 shows output per hour for the same four sub-groups as in figure 6: CSR and non-CSR type men and women. There does not appear to be much of an effect on productivity from either the wage or the CSR incentives for the non-CSR types. (Note that we do not necessarily expect the wage rate to increase productivity since subjects are only paid by the hour.) For CSR types, however, two striking patterns emerge. CSR-type women who work with CSR tasks appear to increase productivity from about 32.6 to 40.07 units per hour in the $11 wage group (MWU: p=0.077) and from about 39.7 to 43.4 units per hour in the $15 wage group (MWU: p=0.130). This implies that CSR-type women not only supply more labor when working with CSR tasks, they also produce more output per hour. For CSR-type men, however, we see a decrease in productivity of about 5-6 units per hour. This effect is, however, not statistically significant.

Figure 7: Productivity by sub-group. Panels: (a) neutral in hiring stage, women; (b) CSR in hiring stage, women; (c) neutral in hiring stage, men; (d) CSR in hiring stage, men. Each panel shows output per hour for non-CSR tasks and CSR tasks at the $11 and $15 wage. Error bars show 5% confidence intervals.

Is the fact that CSR-type women increase productivity when working with CSR tasks an indication of a shift in focus from quality to quantity? To answer this question, we examine figure 8, which shows our measure of quality, the share of correctly entered items, across treatments. It is clear that those women who increased their productivity when working with CSR tasks did not do so at the expense of quality. On the contrary, quality actually goes up when CSR-type women work with CSR tasks. CSR-type men, however, appear to be increasing their accuracy even more than the women. This is the case for both the $11 and $15 wage groups. Also note that non-CSR type men in the $15 wage group increase their accuracy when working with CSR tasks. This suggests that men, to a greater degree than women, respond to CSR incentives in terms of increased quality. These patterns are robust to several definitions of quality.

Figure 8: Accuracy by sub-group. Panels: (a) neutral in hiring stage, women; (b) CSR in hiring stage, women; (c) neutral in hiring stage, men; (d) CSR in hiring stage, men. Each panel shows the share correct for non-CSR tasks and CSR tasks at the $11 and $15 wage. Error bars show 5% confidence intervals.

To summarize the results on the intensive margin, we first note that workers who are paid a higher wage work longer hours. Moreover, for a given wage level, subjects work longer hours if they are working with a CSR task. This pattern is primarily driven by the effect of CSR incentives on women who selected into the job with information about CSR. In terms of productivity, women also respond strongly to CSR incentives. The finding that women appear to be more responsive to social cues such as CSR is consonant with insights from the literature (Croson and Gneezy 2009). Women are also more productive when working with CSR tasks. This increase does not appear to be at the expense of quality.
Men who work with CSR tasks appear to be producing higher quality of output but at a slower rate. Overall, these patterns suggest that CSR incentives draw out higher output and productivity from women and higher quality from men. The stark contrast in performance and response to treatments between CSR and non-CSR types suggests that selection plays an important part in driving the results.

5 Structural Estimation Results

5.1 Leisure Preference

Figure 9 shows the estimated cost and marginal cost functions. Due to our normalization procedure, the functions presented in the figure correspond to the median worker in the $15 wage group. In order to induce the median worker to supply 8 hours of labor, we would have to compensate the worker with a total amount of $500. This corresponds to an increase in the wage rate from $15 to $60 per hour. Also note that to induce between 10 and 16 hours of work, we would have to incentivize the workers with quite substantial amounts. Again, it is important to note that this is true only for the median worker in our sample. The median worker in the $15 wage treatment supplied an average of 1.7 hours of labor each day. Those subjects who actually worked an average of 10 hours or more per day did so at a much lower compensation than what the median worker would require. Since we are estimating the homogeneous part of the cost function c(·), the actual cost function for an individual with a higher (lower) cost of leisure would be scaled up (down) by a multiplicative constant, namely the individual θ_{Li}.

Table 3 shows the estimated treatment effects for men and women. Recall that a lower θ_{Lit} implies a smaller alternative cost of time and, ceteris paribus, a greater labor supply. The treatment effect for males is β_{L1} = 0.9859. This estimate is, however, not significantly different from one. (Recall that the multiplicative form of θ_L implies that an estimate of 1 indicates no effect.)
Hence, we do not find any evidence that men, conditional on selection, respond positively to work stage CSR incentives in terms of labor supply. The estimated treatment effect parameter for women is β^F_{L1} = 0.7800. The total treatment effect for female workers is 0.78 × 0.9859 ≈ 0.769. This implies a decrease in the leisure preference parameter of roughly 25% for women who are incentivized by CSR in the work stage.

Figure 9: Estimated cost and marginal cost functions. Panels: (a) cost function c(h); (b) marginal cost function MC(h). Horizontal axis: hours, 0 to 16.

Parameter    Point estimate    Standard error
β_{L1}       0.9859            0.0245
β^F_{L1}     0.7800            0.0441

Table 3: Estimates of leisure preference treatment effects for men and women.

5.2 Productivity

Table 4 shows the estimated treatment and selection effects for the productivity parameter. A higher value of θ_P implies a more productive worker, since a given unit of output takes less time to produce. First note that the overall baseline productivity type is 1.514 (SD = 1.001). There are no large differences in baseline productivity types across genders. The male baseline productivity type is slightly lower than the overall baseline, 1.448 (SD = 0.961), while the female baseline productivity is slightly higher, 1.537 (SD = 1.017). The estimated distributions of the productivity parameter for male and female workers are shown in figure 10. It is clear that the distribution of θ_P is very similar between male and female workers in the sample. There is a small tendency for female workers to have slightly greater productivity parameters in the upper part of the distribution. This difference is, however, not sufficiently large to make us conclude that there is any baseline productivity difference between men and women. This means that from the outset, before CSR incentives come into play, male and female workers start out with roughly the same productivity distributions.

Type of effect on θ_P       Parameter             Point estimate    Standard error    p-value
Learning                    δ                     0.8556            0.0136            < 0.001
Treatment, males            β_{P1}                0.9933            0.0271            0.805
Selection, males            β_{P0}                1.2633            0.3243            0.417
Treatment, females          β^F_{P1}              1.0336            0.0337            0.319
Selection, females          β^F_{P0}              1.1105            0.3485            0.751
Total treatment, females    β^F_{P1} × β_{P1}     1.0267            0.0189            0.158
Total selection, females    β^F_{P0} × β_{P0}     1.4030            0.1877            0.033

Mean θ_P (SD)      Baseline         Neutral recruited    CSR recruited
Total sample:      1.514 (1.001)    0.996 (0.695)        2.032 (0.994)
Male sample:       1.448 (0.961)    1.056 (0.756)        1.806 (1.003)
Female sample:     1.537 (1.017)    0.976 (0.679)        2.116 (0.985)

Table 4: Productivity parameter θ_P estimates. Reporting bootstrapped standard errors.

Figure 10: Distributions of the productivity parameter. (Estimated CDFs of θ_P for the full sample, males, and females.)

When we introduce CSR, a different pattern emerges. First, consider the treatment effect. Table 4 indicates that there is no significant response in productivity from work stage CSR incentives for men. The estimated treatment effect for male workers is 0.9933, not significantly different from one. The full treatment effect for women is only slightly larger, 1.0267, but the effect is not statistically significant. Since there are no significant treatment effects, we conclude that work stage CSR incentives do not work particularly well in inducing higher productivity from male or female workers. This finding is supported by figures 11 and 12, showing the distribution of θ_P among treated and untreated workers. The distributions essentially coincide, implying that work stage CSR incentives have no effect on productivity regardless of baseline productivity level.

Figure 11: Distributions of the productivity parameter by work stage incentive (both genders). (Estimated CDFs of θ_P for the neutral treated and CSR treated samples.)

Figure 12: Distributions of the productivity parameter by work stage incentive and gender. (Estimated CDFs of θ_P for the neutral treated and CSR treated samples; separate panels for females and males.)

We now turn to the selection effect. Figure 13 indicates that workers who select into the job under CSR incentives are significantly more productive than those who select in without incentives. From table 4, we see that the male productivity type among the neutral recruited workers is 1.056, while the one for CSR recruited workers is 1.806. The corresponding productivity parameters for female workers are 0.976 and 2.116. These are quite substantial differences. The estimate for the male selection effect is positive but not statistically significant. (This is most likely due to the relatively small sample of males in our experiment.) In other words, it does not appear as if male worker productivity is affected by CSR incentives in either the hiring or the work stage. The selection effect for women is, however, significant. The total selection effect for women is 1.403, meaning that women who select into the job under CSR incentives are about 40% more productive. These patterns are verified by figure 14. In particular, note the substantial difference in the distribution of θ_P for the different types of women in the sample.
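Because the leisure and productivity effects enter multiplicatively, the total effects for women quoted above are products of the male coefficient and the female shift. The sketch below re-derives them from the point estimates in Tables 3 and 4. The variable and function names are illustrative assumptions, and the composition rule is read off the text ("0.78 × 0.9859 ≈ 0.769") and the reported totals rather than from the model equations, which are not reproduced in this section.

```python
def total_effect(male_coef, female_shift):
    """Total multiplicative effect for women: male coefficient times female shift."""
    return male_coef * female_shift


def percent_change(multiplier):
    """Percent change implied by a multiplicative effect (1.0 means no effect)."""
    return (multiplier - 1.0) * 100.0


# Point estimates from Table 3 (leisure) and Table 4 (productivity).
beta_L1, beta_F_L1 = 0.9859, 0.7800   # work stage treatment, leisure preference
beta_P1, beta_F_P1 = 0.9933, 1.0336   # work stage treatment, productivity
beta_P0, beta_F_P0 = 1.2633, 1.1105   # hiring stage selection, productivity

leisure_total = total_effect(beta_L1, beta_F_L1)
treatment_total = total_effect(beta_P1, beta_F_P1)
selection_total = total_effect(beta_P0, beta_F_P0)

print(f"Leisure, total treatment (women):      {leisure_total:.4f} "
      f"({percent_change(leisure_total):+.1f}%)")
print(f"Productivity, total treatment (women): {treatment_total:.4f} "
      f"({percent_change(treatment_total):+.1f}%)")
print(f"Productivity, total selection (women): {selection_total:.4f} "
      f"({percent_change(selection_total):+.1f}%)")
```

The products agree with the reported totals up to rounding in the fourth decimal. Standard errors for the totals cannot be recovered this way, since they depend on the joint bootstrap distribution of the two coefficients.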
It is clear that the hiring stage CSR incentive brings in women who are a great deal more productive than those who select in without CSR incentives. Finally, while not of primary interest, we still note that we find evidence of learning by doing. As δ̂ < 1, workers become more productive over time rather than showing signs of exhaustion.

Figure 13: Distributions of the productivity parameter by hiring stage incentive (both genders). (Estimated CDFs of θ_P for the neutral selected and CSR selected samples.)

Figure 14: Distributions of the productivity parameter by hiring stage incentive and gender. (Estimated CDFs of θ_P for the neutral selected and CSR selected samples; separate panels for females and males.)

5.3 Accuracy

Table 5 shows the estimated treatment and selection effects for accuracy. A greater θ_A means higher accuracy, ceteris paribus. Note that in contrast to productivity, the accuracy parameter may assume a negative value. The baseline accuracy parameter in the whole sample is 0.725 (SD = 0.797), while the baseline for men is 0.514 (SD = 0.519) and the baseline for women is 0.799 (SD = 0.863). This means that at baseline, women are considerably more accurate than men in our sample. This pattern is reflected in figure 15, showing the estimated distributions of θ_A in the whole sample and for male and female workers separately. It is clear that women are more accurate than men over most of the distribution of θ_A. Most of the difference between men and women seems to come from some female workers with particularly high θ_A.

We now turn to treatment effects on accuracy. Figure 16 shows the distribution of θ_A for treated and untreated workers. It is clear that the work stage CSR incentive shifts the distribution towards higher accuracy in the whole sample.
From table 5 we see that the treatment effect for males is 0.6654 while the total treatment effect for women is 0.3265. This means that work stage CSR incentives induce higher accuracy for both men and women. The effect on male workers is particularly large. (This is partly driven by the fact that men at baseline have lower accuracy than women.) These patterns are verified by figure 17. Work stage CSR incentives appear to shift the whole distribution of accuracy for men but primarily work on the middle of the distribution for women. (Again, since there are female workers with very high baseline accuracy, we do not expect a particularly large treatment effect on the upper part of the distribution for women.)

Type of effect on θ_A       Parameter             Point estimate    Standard error    p-value
Treatment, males            β_{A1}                0.6654            0.0871            0.00017
Selection, males            β_{A0}                0.2708            0.1394            < 0.00001
Treatment, females          β^F_{A1}              −0.3389           0.1289            < 0.00001
Selection, females          β^F_{A0}              0.0871            0.2060            0.00002
Total treatment, females    β^F_{A1} + β_{A1}     0.3265            0.0945            < 0.00001
Total selection, females    β^F_{A0} + β_{A0}     0.3579            0.1536            < 0.00001

Mean θ_A (SD)      Baseline         Neutral recruited    CSR recruited
Total sample:      0.725 (0.797)    0.560 (0.852)        0.890 (0.704)
Male sample:       0.514 (0.519)    0.382 (0.395)        0.634 (0.593)
Female sample:     0.799 (0.863)    0.618 (0.950)        0.985 (0.723)

Table 5: Accuracy parameter θ_A estimates. Reporting bootstrapped standard errors.

Just like in the case of productivity, we can examine the selection effect by observing the distribution of θ_A for those in the sample who selected into the job with and without CSR incentives. Figure 18 indicates that selection is important primarily in the lower part of the distribution for the full sample. If we split these distributions by gender, we see that both male and female workers are affected by hiring stage CSR incentives. The overall pattern appears very similar to the treatment effect. Table 5 indicates a selection effect of 0.2708 on male workers and 0.3579 on female workers.
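In contrast to productivity, the accuracy effects appear to combine additively. Reading the total effects for women in Table 5 as the male coefficient plus the female shift (an assumption based on the table's row labels and the reported numbers, not on the model equations), the point estimates reproduce the reported totals:

```latex
\begin{align*}
\text{Total treatment, females:}\quad
  \beta^{F}_{A1} + \beta_{A1} &= -0.3389 + 0.6654 = 0.3265, \\
\text{Total selection, females:}\quad
  \beta^{F}_{A0} + \beta_{A0} &= 0.0871 + 0.2708 = 0.3579.
\end{align*}
```

Under this reading, the negative female treatment shift does not mean that CSR tasks lower women's accuracy; it means the gain for women (0.3265) is smaller than the gain for men (0.6654).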
While the treatment effect was greater for men than for women, the selection effect appears to be greater for women. This suggests that hiring stage CSR incentives work well in bringing in women with high accuracy.

To summarize the structural analysis, we conclude that our results are largely consistent with the reduced form findings presented in the previous section. The baseline productivity levels are similar across male and female workers. CSR incentives primarily work in the hiring stage by inducing more productive workers to select into the job. This is true for both male and female workers, but in particular for female workers. Work stage CSR incentives do not seem to have any significant effect on productivity. For accuracy, we observe significant effects in both the hiring and the work stage. Female workers have higher accuracy at baseline. While the treatment effect induces higher accuracy from female workers, the baseline level is already so high that we only see limited improvement. The treatment effect on male workers is, however, particularly large, bringing them closer to the females in terms of accuracy. This means that the work stage CSR incentive causes a convergence in performance across men and women in terms of accuracy. Selection also appears to be important for accuracy. Both male and female workers respond well to hiring stage CSR incentives. The effect is particularly large for females. The estimated treatment effects on leisure preference appear to indicate that women increase their labor supply when incentivized by CSR. There is room for much further analysis regarding leisure preference.

Figure 15: Distributions of the accuracy parameter. (Estimated CDFs of θ_A for the full sample, males, and females.)

Figure 16: Distributions of the accuracy parameter by work stage incentive (both genders). (Estimated CDFs of θ_A for the neutral treated and CSR treated samples.)
Figure 17: Distributions of the accuracy parameter by work stage incentive and gender. (Estimated CDFs of θ_A for the neutral treated and CSR treated samples; separate panels for females and males.)

Figure 18: Distributions of the accuracy parameter by hiring stage incentive (both genders). (Estimated CDFs of θ_A for the neutral selected and CSR selected samples.)

Figure 19: Distributions of the accuracy parameter by hiring stage incentive and gender. (Estimated CDFs of θ_A for the neutral selected and CSR selected samples; separate panels for females and males.)

6 Concluding Remarks

The “business of business is business” has been a mantra associated with Milton Friedman for nearly half a century. Having articulated his view in a 1970 New York Times article that the only responsibility of business was to increase its profits, he described contrarians as “puppets of the intellectual forces that have been undermining the basis of a free society.” Today, over 90% of major businesses have specific programs dedicated to corporate social responsibility, and many CEOs trumpet their organization’s commitment to social engagement. Critics view such facts as strong evidence that Friedman was wrong. But was he? A narrow interpretation of Friedman certainly has proven incorrect, as the business world has evolved in a much more social manner than Friedman likely anticipated. This does not, however, mean that CSR is necessarily at odds with maximizing profits.

This study takes a systematic approach to exploring the supply side effects of CSR. Much of the previous empirical work has focused on the demand side, with mixed results.
This has led to hotly contested debates in corporate law, business ethics, economics, and the social sciences more broadly. Our study combines theory with a natural field experiment to explore how a firm can use CSR to impact its labor supply and the subsequent effort levels and productivity of its workers. In this way, our study is unique in that we begin with a well-specified model of CSR and estimate its behavioral parameters using a field experiment.

Our results are generally positive for CSR, in that practical, market-based approaches can be used to affect the supply side of business. For instance, advertising the firm as a CSR firm increases the application probability by roughly the same amount as an increase of hourly wages from $11 to $15. In addition, once workers are employed, CSR incentives matter at each wage level in that those working for CSR firms work more hours. In terms of output per hour, CSR incentives work to improve productivity of all workers, but especially women. Importantly, there is no concomitant decrease in product quality.

Footnote 6: These gender differences in response to CSR may seem at odds with findings from some recent studies on the effect of CSR. In particular, Bode and Singh (2014) find no differences in CSR response across men and women in a consultancy firm. It is, however, possible that the type of men who select into a female-dominated market such as data entry differ from the general population, just like those women who select into a male-dominated field such as management consulting.

We conclude that CSR should not be viewed as a necessary distraction from a profit motive, but rather as an important part of profit maximization similar to other non-pecuniary incentives. Our results highlight that the real challenge for CEOs is not whether CSR should be used, but what are the best social innovations to put in place. We view our work as only a start on how to leverage economic theory and field experiments to lend insights into CSR.
The next steps should refine our approach and utilize different forms and incentives of CSR to provide a more complete understanding of the underlying motivations and behaviors on both the demand and supply sides of the market.

References

Ashraf, N., Berry, J., and Shapiro, J. M. (2010). Can higher prices stimulate product use? Evidence from a field experiment in Zambia. American Economic Review, 100:2383–2413.

Barnett, M. L. and Salomon, R. M. (2012). Does it pay to be really good? Addressing the shape of the relationship between social and financial performance. Strategic Management Journal, 33(11):1304–1320.

Bode, C. and Singh, J. (2014). Taking a hit to save the world: Employee participation in social initiatives. Technical report, Working paper.

Burbano, V. C. (2015). Getting virtual workers to do more by doing good: Field experimental evidence. Technical report, Working paper.

Burbano, V. C. (2016). Social responsibility messages and worker wage requirements: Field experimental evidence from online labor marketplaces. Organization Science, 27:1010–1028.

Carnahan, S., Kryscynski, D., and Olson, D. (2015). How corporate social responsibility reduces employee turnover: Evidence from attorneys before and after 9/11. Technical report, Working paper.

Chandler, J., Mueller, P., and Paolacci, G. (2014). Nonnaïveté among Amazon Mechanical Turk workers: Consequences and solutions for behavioral researchers. Behavior Research Methods, 46(1):112–130.

Croson, R. and Gneezy, U. (2009). Gender differences in preferences. Journal of Economic Literature, 47(2):448–474.

DellaVigna, S., List, J. A., and Malmendier, U. (2012). Testing for altruism and social pressure in charitable giving. The Quarterly Journal of Economics, 127:1–56.

DellaVigna, S., List, J. A., Malmendier, U., and Rao, G. (2015). Voting to tell others. Technical Report 19832, NBER.

D’Haultfoeuille, X. and Février, P. (2015). Identification of nonseparable triangular models with discrete instruments.
Econometrica, 83(3):1199–1210.

Du, S., Bhattacharya, C., and Sen, S. (2011). Corporate social responsibility and competitive advantage: Overcoming the trust barrier. Management Science, 57(9):1528–1545.

Elfenbein, D. W., Fisman, R., and McManus, B. (2012). Charity as a substitute for reputation: Evidence from an online marketplace. The Review of Economic Studies, 79(4):1441–1468.

Flory, J. A., Leibbrandt, A., and List, J. A. (2014). Do competitive workplaces deter female workers? A large-scale natural field experiment on job-entry decisions. The Review of Economic Studies, 82(4).

Friedman, M. (1970). The social responsibility of business is to increase its profits. The New York Times Magazine, pages 32–33.

Gill, D. and Prowse, V. (2012). A structural analysis of disappointment aversion in a real effort competition. The American Economic Review, 102(1):469–503.

Godfrey, P. C., Merrill, C. B., and Hansen, J. M. (2009). The relationship between corporate social responsibility and shareholder value: An empirical test of the risk management hypothesis. Strategic Management Journal, 30(4):425–445.

Goodman, J. K., Cryder, C. E., and Cheema, A. (2013). Data collection in a flat world: The strengths and weaknesses of Mechanical Turk samples. Journal of Behavioral Decision Making, 26(3):213–224.

Greening, D. W. and Turban, D. B. (2000). Corporate social performance as a competitive advantage in attracting a quality workforce. Business & Society, 39(3):254–280.

Harrison, G. W. and List, J. A. (2004). Field experiments. Journal of Economic Literature, 42(4):1009–1055.

Hickman, B. R., Hubbard, T. P., and Paarsch, H. J. (2016). Identification and estimation of a bidding model for electronic auctions. Quantitative Economics, forthcoming.

Leibbrandt, A. and List, J. A. (2014). Do women avoid salary negotiations? A large-scale natural field experiment. Management Science, 61(9).

Levitt, S. D. and List, J. A. (2007).
What do laboratory experiments measuring social preferences reveal about the real world? The Journal of Economic Perspectives, 21(2):153–174.

List, J. A. (2011). The market for charitable giving. The Journal of Economic Perspectives, 25(2):157–180.

McDonnell, M.-H. and King, B. (2013). Keeping up appearances: Reputational threat and impression management after social movement boycotts. Administrative Science Quarterly, 58(3):387–419.

Paolacci, G. and Chandler, J. (2014). Inside the Turk: Understanding Mechanical Turk as a participant pool. Current Directions in Psychological Science, 23(3):184–188.

Roback, J. (1982). Wages, rents, and the quality of life. The Journal of Political Economy, pages 1257–1278.

Rosen, S. (1986). The theory of equalizing differences. Handbook of Labor Economics, 1:641–692.

Servaes, H. and Tamayo, A. (2013). The impact of corporate social responsibility on firm value: The role of customer awareness. Management Science, 59(5):1045–1061.

Torgovitsky, A. (2015). Identification of nonseparable models using instruments with small support. Econometrica, 83(3):1185–1197.

Waddock, S. A. and Graves, S. B. (1997). The corporate social performance-financial performance link. Strategic Management Journal, pages 303–319.

Wren, D. A. (2005). The history of management thought. John Wiley & Sons.

Appendix A: Accuracy with Learning Effect

Accuracy measures whether or not unit q in period t was correctly entered by i. For our purposes, higher accuracy directly translates to greater output quality. Define the latent accuracy indexer \(a^*_{q_{it}}\) as

\[ a^*_{q_i} = \Theta_{A q_i} + \varepsilon_{q_i} \tag{32} \]

where \(\varepsilon_{q_i} \sim N(0, 1)\). The accuracy parameter is defined as

\[ \Theta_{A q_i} = \Theta_{A i} + T_{q_i} + \beta_{A2} \ln(q_i), \]

where \(\Theta_{A i} = \theta_{A i} + \beta_{A0} X_{0i} + \beta^{F}_{A0} X_{0i} D^{F}_{i}\) is the baseline accuracy type and \(T_{q_i} = \beta_{A1} X_{1 q_i} + \beta^{F}_{A1} X_{1 q_i} D^{F}_{i}\) is the treatment effect.
Note that \(\beta_{A2}\) is a learning parameter introduced to account for both learning-by-doing (higher quality over time) and the risk of exhaustion (lower quality over time). We do not observe the latent accuracy indexer \(a^*_{q_{it}}\) directly. Instead, we observe \(a_{q_{it}} \in \{0, 1\}\), where

\[ a_{q_{it}} = \begin{cases} 1 & \text{if } a^*_{q_i} > 0 \\ 0 & \text{otherwise.} \end{cases} \]

This implies

\[ \Pr[a_{q_i} = 1 \mid \Theta_{A q_i}] = \Phi(-\Theta_{A q_i}), \]

where \(\Phi(\cdot)\) is the standard normal CDF. As in the case of leisure preference and productivity, we will construct a differencing estimator. In this case, however, we have to make do with non-linear difference equations. We begin with a difference equation for learning; we then derive difference equations for treatment and selection effects. Finally, we formulate a Probit estimator for the individual fixed effects.

6.0.1 Accuracy Learning Effects

Let \(\Delta a_{q_i} \equiv a_{q_i} - a_{q_i - 1}\), \(q_i \ge 2\), and note that

\[ \Delta a_{q_i} = \begin{cases} -1 & \text{if } a^*_{q_i} < 0 \text{ and } a^*_{q_i - 1} > 0 \\ 0 & \text{if } (a^*_{q_i} > 0 \text{ and } a^*_{q_i - 1} > 0) \text{ or } (a^*_{q_i} < 0 \text{ and } a^*_{q_i - 1} < 0) \\ 1 & \text{if } a^*_{q_i} > 0 \text{ and } a^*_{q_i - 1} < 0. \end{cases} \]

This implies

\[ \Pr[\Delta a_{q_i} = -1] = \Phi(\Theta_{A q_i})\,\Phi(-\Theta_{A q_i - 1}), \]

\[ \Pr[\Delta a_{q_i} = 0] = \Phi(\Theta_{A q_i})\,\Phi(\Theta_{A q_i - 1}) + \Phi(-\Theta_{A q_i})\,\Phi(-\Theta_{A q_i - 1}), \]

and

\[ \Pr[\Delta a_{q_i} = 1] = \Phi(-\Theta_{A q_i})\,\Phi(\Theta_{A q_i - 1}). \]

With data on within-person differences of accuracy for units produced on the same day, this pins down the learning effect \(\beta_{A2}\) given some \(\Theta_{A i}\), \(T_{q_i}\), since

\[ \begin{aligned} E[\Delta a_{q_i}] = {} & \Phi(-\Theta_{A i} - T_{q_i} - \beta_{A2} \ln(q_i))\,\Phi(\Theta_{A i} + T_{q_i} + \beta_{A2} \ln(q_i - 1)) \\ & - \Phi(\Theta_{A i} + T_{q_i} + \beta_{A2} \ln(q_i))\,\Phi(-\Theta_{A i} - T_{q_i} - \beta_{A2} \ln(q_i - 1)) \\ \equiv {} & \mathcal{A}(q_i; \Theta_{A i}, T_{q_i}). \end{aligned} \]

The only part of \(E[\Delta a_{q_i}]\) that changes with each \(\Delta A_{q_i}\) is \(\ln(q_i)\). Hence, we can write \(\mathcal{A}(q_i; \Theta_{A i}, T_{q_i})\) as a function of just \(q_i\) and the parameters \(\Theta_{A i}\) and \(T_{q_i}\).
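As a rough numerical illustration of the difference probabilities above, the following sketch evaluates Pr[Δa = −1], Pr[Δa = 0], Pr[Δa = 1], and E[Δa] for a single worker. It takes the text's expression Pr[a = 1 | Θ] = Φ(−Θ) as given and holds the treatment term fixed between units q − 1 and q. The function names and the parameter values are hypothetical and chosen only for illustration; this is not the estimation code used in the paper.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal CDF, Phi(x), computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def accuracy_index(theta_i: float, t_qi: float, beta_a2: float, q: int) -> float:
    """Theta_{A q_i} = Theta_{A i} + T_{q_i} + beta_{A2} * ln(q)."""
    return theta_i + t_qi + beta_a2 * math.log(q)


def difference_probs(theta_i: float, t_qi: float, beta_a2: float, q: int) -> dict:
    """Probabilities of Delta a_{q_i} in {-1, 0, 1} for unit q >= 2.

    Follows the expressions in the text, with Pr[a = 1 | Theta] = Phi(-Theta)
    and the same treatment term T for units q - 1 and q (an assumption here).
    """
    if q < 2:
        raise ValueError("differences are defined for q >= 2")
    th_now = accuracy_index(theta_i, t_qi, beta_a2, q)
    th_prev = accuracy_index(theta_i, t_qi, beta_a2, q - 1)
    p_minus = std_normal_cdf(th_now) * std_normal_cdf(-th_prev)
    p_plus = std_normal_cdf(-th_now) * std_normal_cdf(th_prev)
    p_zero = (std_normal_cdf(th_now) * std_normal_cdf(th_prev)
              + std_normal_cdf(-th_now) * std_normal_cdf(-th_prev))
    return {-1: p_minus, 0: p_zero, 1: p_plus}


def expected_difference(theta_i: float, t_qi: float, beta_a2: float, q: int) -> float:
    """E[Delta a_{q_i}] = Pr[Delta a = 1] - Pr[Delta a = -1]."""
    probs = difference_probs(theta_i, t_qi, beta_a2, q)
    return probs[1] - probs[-1]


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    for q in (2, 10, 100):
        print(q, expected_difference(0.3, -0.2, -0.1, q))
```

With the learning parameter set to zero the two indices coincide and the expected difference is zero, which is why variation in E[Δa] across q carries information about the learning parameter once the baseline type and treatment term are fixed.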
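6.0.2 Accuracy Treatment Effects

Let

\[ Y^{A}_{ji} = \begin{cases} \dfrac{\sum_{q_i=1}^{Q_i} 1[A_{q_i} = 1 \;\&\; X_{1 q_i} = 1(j = C)]}{\sum_{q_i=1}^{Q_i} 1[X_{1 q_i} = 1(j = C)]} & \text{if } \sum_{q_i=1}^{Q_i} 1[X_{1 q_i} = 1(j = C)] > 0 \\ 0 & \text{otherwise} \end{cases} \]

\[ A^{1}_{ji} = \begin{cases} \dfrac{\sum_{q_i=1}^{Q_i} \Phi(-\Theta_{A i} - T_{q_i} - \beta_{A2} \ln(q_i))\, 1[X_{1 q_i} = 1(j = C)]}{\sum_{q_i=1}^{Q_i} 1[X_{1 q_i} = 1(j = C)]} & \text{if } \sum_{q_i=1}^{Q_i} 1[X_{1 q_i} = 1(j = C)] > 0 \\ 0 & \text{otherwise} \end{cases} \qquad j = C, N, \]

and note that for all i such that \(\sum_{q_i=1}^{Q_i} 1[X_{1 q_i} = 1(j = C)] > 0\), \(j = C, N\),

\[ E\left[ \left( Y^{A}_{C i} - Y^{A}_{N i} \right) - \left( A^{1}_{C i} - A^{1}_{N i} \right) \right] = 0. \]

This pins down \(\beta_{A1}\) and \(\beta^{F}_{A1}\) given some \(\Theta_{A i}\).

6.0.3 Accuracy Selection Effects

Define

\[ \omega^{A l}_{r} = \frac{\sum_{i=1}^{I} \sum_{q_i=1}^{Q_i} 1[A_{q_i} = 1 \;\&\; D^{F}_{i} = 1(l = F) \;\&\; X_{0i} = 1(r = C)]}{\sum_{i=1}^{I} \sum_{q_i=1}^{Q_i} 1[D^{F}_{i} = 1(l = F) \;\&\; X_{0i} = 1(r = C)]} \]

\[ A^{0 l}_{r} = \frac{\sum_{i=1}^{I} \sum_{q_i=1}^{Q_i} \Phi(-\Theta_{A q_i})\, 1[D^{F}_{i} = 1(l = F) \;\&\; X_{0i} = 1(r = C)]}{\sum_{i=1}^{I} \sum_{q_i=1}^{Q_i} 1[D^{F}_{i} = 1(l = F) \;\&\; X_{0i} = 1(r = C)]} \]

and note that

\[ E\left[ \left( \omega^{AM}_{N} - \omega^{AM}_{C} \right) - \left( A^{0M}_{N} - A^{0M}_{C} \right) \right] = 0 \]

and

\[ E\left[ \left( \omega^{AF}_{N} - \omega^{AF}_{C} \right) - \left( \omega^{AM}_{N} - \omega^{AM}_{C} \right) - \left( A^{0F}_{N} - A^{0F}_{C} \right) - \left( A^{0M}_{N} - A^{0M}_{C} \right) \right] = 0. \]

This pins down \(\beta_{A0}\) and \(\beta^{F}_{A0}\) due to random assignment and

\[ E\left[ \theta_{A} \mid X_{0i}, D^{F}_{i} \right] = E[\theta_{A}]. \]

6.0.4 Accuracy Fixed Effects

By definition of the Probit model, we have

\[ E[A_{q_i} \mid \Theta_{A i}, T_{q_i}, q_i] = \Phi(-\Theta_{A i} - T_{q_i} - \beta_{A2} \ln(q_i)). \]

The above equations give us a combined NLS estimator of the following form:

\[ \begin{aligned} \left[ \{\hat{\theta}_{A i}\}_{i=1}^{I}, \hat{\beta}_{A1}, \hat{\beta}^{F}_{A1}, \hat{\beta}_{A0}, \hat{\beta}^{F}_{A0}, \hat{\beta}_{A2} \right] = \arg\min_{\gamma} \Bigg\{ & \sum_{i=1}^{I} \sum_{q_i=1}^{Q_i} \left[ \Delta A_{q_i} - \mathcal{A}(q_i, \Theta_{A i}, T_{q_i}) \right]^2 1[T_{q_i} = T_{q_i - 1}] \\ & + \sum_{i=1}^{I} \left[ \left( Y^{A}_{C i} - Y^{A}_{N i} \right) - \left( A^{1}_{C i} - A^{1}_{N i} \right) \right]^2 1\left[ \sum_{q_i=1}^{Q_i} 1[X_{1 q_i} = 1] > 0 \;\&\; \sum_{q_i=1}^{Q_i} 1[X_{1 q_i} = 0] > 0 \right] \\ & + \left[ \left( \omega^{AM}_{N} - \omega^{AM}_{C} \right) - \left( A^{0M}_{N} - A^{0M}_{C} \right) \right]^2 \\ & + \left[ \left( \omega^{AF}_{N} - \omega^{AF}_{C} \right) - \left( \omega^{AM}_{N} - \omega^{AM}_{C} \right) - \left( A^{0F}_{N} - A^{0F}_{C} \right) - \left( A^{0M}_{N} - A^{0M}_{C} \right) \right]^2 \\ & + \sum_{i=1}^{I} \sum_{q_i=1}^{Q_i} \left( A_{q_i} - \Phi(-\Theta_{A q_i}) \right)^2 \Bigg\} \end{aligned} \]

Appendix B: Analytical Solution for \(E[\theta^{j}_{(i)}]\)

This appendix presents a method for analytically computing an empirical analog to the left-hand side of

\[ E[\theta^{j}_{(i)}] = I \binom{I-1}{i-1} \int_{0}^{1} Q^{j}(t, \alpha^{j})\, t^{\,i-1} (1 - t)^{I-i}\, dt, \qquad \forall i = 1, 2, \ldots, I, \quad j \in \{P, A, L\}. \]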
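As a quick numerical check on the order-statistic formula above, the following sketch approximates the integral with a composite midpoint rule. The function name, the argument names (`n` stands in for the sample size I), and the default number of grid steps are illustrative assumptions and are not part of the estimation procedure described in the paper.

```python
import math
from typing import Callable


def expected_order_statistic(quantile_fn: Callable[[float], float],
                             i: int, n: int, steps: int = 20000) -> float:
    """Approximate E[theta_(i)] for a sample of size n.

    Implements n * C(n-1, i-1) * int_0^1 Q(t) t^(i-1) (1-t)^(n-i) dt,
    with the integral evaluated by the composite midpoint rule.
    """
    if not 1 <= i <= n:
        raise ValueError("need 1 <= i <= n")
    weight = n * math.comb(n - 1, i - 1)
    h = 1.0 / steps
    total = 0.0
    for s in range(steps):
        t = (s + 0.5) * h  # midpoint of the s-th subinterval of [0, 1]
        total += quantile_fn(t) * t ** (i - 1) * (1.0 - t) ** (n - i)
    return weight * h * total


if __name__ == "__main__":
    # Uniform quantile function Q(t) = t, chosen only as a test case.
    print(expected_order_statistic(lambda t: t, 2, 5))
```

For the uniform quantile function Q(t) = t the formula reduces to i/(I + 1), so the call in the usage example should return a value close to 1/3.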
Commonly used methods employ the empirical quantile function:

\[ \hat{Q}^{j}(r) \equiv \inf\left\{ \rho : \frac{1}{I} \sum_{i=1}^{I} 1(\theta^{j}_{i} \le \rho) \ge r \right\}. \]

But this introduces kinks into the GMM objective function, which complicates asymptotic theory and standard errors. Alternatively, one could consider a simulation-based estimator of quantile \(i/(I+1)\) for some \(i = 1, 2, \ldots, I\) by the following procedure:

1. For \(s = 1, 2, \ldots, S\), sample I draws from \(\{\theta^{j}_{1}, \theta^{j}_{2}, \ldots, \theta^{j}_{I}\}\) with replacement, and order the observations to get a sample of simulated order statistics \(\{\tilde{\theta}^{j}_{s(1)}, \tilde{\theta}^{j}_{s(2)}, \ldots, \tilde{\theta}^{j}_{s(I)}\}\).

2. Estimate

\[ \widehat{E}[\theta^{j}_{(i)}] = \frac{1}{S} \sum_{s=1}^{S} \tilde{\theta}^{j}_{s(i)} \]

for any desired \(i = 1, 2, \ldots, I\).

This simulation method, however, introduces simulation error and also complicates asymptotic theory and standard errors.

It is, however, possible to analytically compute the asymptotic value of \(\widehat{E}[\theta^{j}_{(i)}]\) as \(S \to \infty\). To do so, begin with the ordered data \(\{\theta^{j}_{(1)} \le \theta^{j}_{(2)} \le \ldots \le \theta^{j}_{(I)}\}\) and note that if we can compute \(P_{km} = \Pr(\tilde{\theta}^{j}_{s(m)} = \theta^{j}_{(k)})\) for a given simulated sample s, then

\[ \lim_{S \to \infty} \widehat{E}[\theta^{j}_{(m)}] = \sum_{k=1}^{I} P_{km}\, \theta^{j}_{(k)}, \]

which eliminates simulation error altogether. Note that

\[ P_{km} = \Pr(\tilde{\theta}^{j}_{s(m)} = \theta^{j}_{(k)}) = \sum_{t=1}^{I} \Pr(\text{exactly } t \text{ draws} = \theta^{j}_{(k)} \;\&\; m \text{ draws} \le \theta^{j}_{(k)}). \]

Assuming that all observations \(\theta^{j}_{i}\) are unique, the probability that (l draws are \(< \theta^{j}_{(k)}\)) ∩ (t draws are \(= \theta^{j}_{(k)}\)) ∩ (\(I - t - l\) draws are \(> \theta^{j}_{(k)}\)) is given by a multinomial probability

\[ \frac{I!}{t!\, l!\, (I - t - l)!} \left( \frac{k-1}{I} \right)^{l} \left( \frac{1}{I} \right)^{t} \left( \frac{I-k}{I} \right)^{I-t-l} = MN\left( [l, t, I - t - l], \left[ \frac{k-1}{I}, \frac{1}{I}, \frac{I-k}{I} \right] \right). \tag{33} \]

The intuition here is that when sampling with replacement from the empirical distribution, every observation has uniform weight \(1/I\), the fraction of values ranked below k is \((k-1)/I\), and the fraction ranked strictly above k is \((I-k)/I\).
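To make the multinomial probability in (33) concrete, the sketch below evaluates it and sums it over feasible counts to obtain \(P_{km}\) and the limiting order-statistic means. It rests on one stated assumption about the event: the m-th simulated order statistic equals the k-th ordered data value when fewer than m draws fall strictly below that value and at least m draws fall at or below it. The function names are hypothetical, `n` stands in for the sample size I, and the code is meant as an illustration for small samples rather than as the authors' implementation.

```python
import math
from typing import List, Sequence


def multinomial_prob(l: int, t: int, n: int, k: int) -> float:
    """Probability in (33): l draws below, t draws equal to, and n - l - t
    draws above the k-th smallest of n distinct values, when n draws are
    taken with replacement and each value has weight 1/n."""
    u = n - l - t
    if min(l, t, u) < 0:
        return 0.0
    coef = math.factorial(n) // (
        math.factorial(l) * math.factorial(t) * math.factorial(u))
    return coef * ((k - 1) / n) ** l * (1 / n) ** t * ((n - k) / n) ** u


def p_km(k: int, m: int, n: int) -> float:
    """Pr(m-th simulated order statistic = k-th ordered data value).

    Assumption: the event requires l <= m - 1 draws strictly below the value
    and l + t >= m draws at or below it, with t >= 1 draws equal to it.
    """
    total = 0.0
    for t in range(1, n + 1):
        for l in range(max(0, m - t), min(m - 1, n - t) + 1):
            total += multinomial_prob(l, t, n, k)
    return total


def limiting_order_stat_means(data: Sequence[float]) -> List[float]:
    """sum_k P_km * theta_(k) for m = 1, ..., n, using the sorted data."""
    ordered = sorted(data)
    n = len(ordered)
    return [sum(p_km(k, m, n) * ordered[k - 1] for k in range(1, n + 1))
            for m in range(1, n + 1)]


if __name__ == "__main__":
    print(limiting_order_stat_means([2.0, 1.0]))
```

With two observations, the smallest simulated draw equals the smaller data value unless both draws hit the larger one, so `p_km(1, 1, 2)` is 3/4. For each m, the probabilities sum to one over k, since the m-th simulated order statistic must equal exactly one of the distinct data values.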
Whenever there are m simulated draws that are \(\le \theta^{j}_{(k)}\) with t total draws being equal to \(\theta^{j}_{(k)}\), that opens up various mutually exclusive possibilities for how many draws are strictly below a value of \(\theta^{j}_{(k)}\). For example, suppose that \(t = 3\). Then if \(m = 1\), there can be no simulated draws from the data strictly below \(\theta^{j}_{(k)}\), but if \(m = 4, \ldots, I - 3\) then the four possibilities are for exactly 0, 1, 2, and 3 draws below \(\theta^{j}_{(k)}\). If we extend this logic to arbitrary t, m, k, and I, we can use (33) to write

\[ P_{km} = \sum_{t=1}^{I} \Pr(\text{exactly } t \text{ draws} = \theta^{j}_{(k)} \;\&\; m \text{ draws} \le \theta^{j}_{(k)}) = \sum_{t=1}^{I} \; \sum_{l = \max\{0,\, m - t\}}^{\min\{m - 1,\, I - t\}} MN\left( [l, t, I - t - l], \left[ \frac{k-1}{I}, \frac{1}{I}, \frac{I-k}{I} \right] \right). \tag{34} \]

By (34), we can explicitly compute \(P_{km}\) for all \((k, m)\) pairs with \(k \le I\) and \(m \le I\), and analytically compute

\[ \lim_{S \to \infty} \widehat{E}[\theta^{j}_{(m)}] = \sum_{k=1}^{I} P_{km}\, \theta^{j}_{(k)} \]

using the ordered values of \(\{\theta^{j}_{i}\}\). But for our quantile function estimator, we pick a set of \(k^{*}_{j}\) quantile ranks

\[ r^{j} = \{r_{1}, r_{2}, \ldots, r_{k^{*}_{j}}\} \subseteq \left\{ \frac{1}{I+1}, \ldots, \frac{I}{I+1} \right\}, \]

where \(k^{*}_{j}\) is tied to the number of free parameters in \(Q^{j}(r; \alpha^{j})\), \(j \in \{P, A, L\}\), making sure we include in the set \(r_{1} = 1/(I+1)\) and \(r_{k^{*}_{j}} = I/(I+1)\), and at least one quantile rank in each subinterval defined by the knot vector \(k^{j}\). The probability matrix P is somewhat expensive to compute, but it need only be computed once for a fixed sample size I and quantile rank vector \(r^{j}\), and can be stored and re-used in each iteration:

\[ P = \begin{bmatrix} P_{1,\, r_{1}(I+1)} & P_{1,\, r_{2}(I+1)} & \cdots & P_{1,\, r_{k^{*}}(I+1)} \\ P_{2,\, r_{1}(I+1)} & P_{2,\, r_{2}(I+1)} & \cdots & P_{2,\, r_{k^{*}}(I+1)} \\ \vdots & \vdots & & \vdots \\ P_{I,\, r_{1}(I+1)} & P_{I,\, r_{2}(I+1)} & \cdots & P_{I,\, r_{k^{*}}(I+1)} \end{bmatrix} \]

Appendix C: Hiring and Work Stage Experimental Material

Figure 20: Screenshot of firm website

Figure 21: Example login information: for-profit client

Figure 22: Example login information: CSR client

Figure 23: Example main screen: for-profit client

Figure 24: Example main screen: CSR client

Figure 25: Example data entry screen (top half)

Figure 26: Example data entry screen (bottom half)