the paper here - Personal Websites

Job market paper: Daniel Hedblom
Toward an Understanding of Corporate Social Responsibility:
Theory and Field Experimental Evidence*
Daniel Hedblom†, Brent R. Hickman, and John A. List
The University of Chicago
.
This version: December 7, 2016
Download latest version: http://jmp.danielhedblom.com
Corporate social responsibility (CSR) has been on the business landscape for more than half a century.
Advocates of CSR argue that such activities can increase long-term profits and social well-being, while
critics argue that CSR serves as a major distraction. Social scientists and business scholars are largely
divided on the value of CSR, with most of the focus on the demand side. We develop a theory and
a tightly-linked field experiment to explore the supply side impact of CSR. Our natural field experiment, in which we created our own firm and hired actual workers, generates a rich data set on worker
behavior and responses to CSR incentives. We use these data to estimate a structural principal-agent
model with three dimensions of worker heterogeneity: productivity, work quality, and value of time.
This allows us to answer a variety of questions related to treatment effects (on existing employees’
behaviors) and selection effects on the pool of applicants from CSR. We find strong evidence that
when a firm convinces its workers that their efforts make the world a better place (as opposed to just
making money) it will attract workers that are more productive, produce higher quality work, and
have more highly valued leisure time. We also find an economically significant treatment effect of
CSR on improving work quality of existing employees as well. Our research design may serve as a
framework for causal inference on more general forms of non-pecuniary incentives in the workplace.
JEL Classifications: C14, J22, J30, M52
* We thank Stephane Bonhomme, Larry Schmidt, Steven Levitt, and Emir Kamenica for helpful comments. Special thanks
to Joseph Seidel for excellent research assistance. The authors wish to thank seminar participants at The University of
Chicago, University of Wisconsin, the Economic Science Association North American meeting, and the University of Chicago
Advances with Field Experiments conference for helpful feedback.
† Corresponding author. E-mail: [email protected]. Address: The University of Chicago, Department of Economics,
1126 East 59th Street, Chicago, Illinois 60637.
1
1
Introduction
Economic historians commonly mark the Industrial Revolution as a major historical moment, as every
aspect of life was permeated—Incomes and the population began to grow, and the standard of living
began a consistent climb for the first time in history. While few would argue with such facts, the
Industrial Revolution also marked a landmark change in social responsibility of business organizations.
Whereas many scholars argue that corporate social responsibility (CSR) has its roots in the 1950s, the
framework began to emerge in the 19th century. As Wren (2005) points out, with the new factory
system in Great Britain came social unrest due to concerns surrounding poverty, slums, and child labor.
Businesses soon responded with the industrial welfare movement, which included introduction of
many worker perks (see pp. 269-270, Wren 2005). Whether such actions were profit motivated, socially
motivated, or a mixture remains debated in the literature.
Amongst scientists, the virtues of CSR have been debated at least since Friedman (1970) famously
described CSR as a ”fundamentally subversive doctrine,” and declared that the only social responsibility of a firm is to maximize its profits.1 Nevertheless, today CSR is a common business practice, as
most large firms have entire branches dedicated to socially responsible concepts and practices. Indeed,
hundreds of millions annually are budgeted on such programs.
One economic explanation for the prevalence of CSR is that making the world a better place is
not necessarily at odds with profit maximization. For instance, proponents of CSR have argued that
investments in CSR actually increase profits. There is, however, no consensus in the empirical literature
on whether investments in CSR have a positive effect on the firm’s bottom line. While some studies
report a positive effect (Waddock and Graves 1997), others find mixed, negative, or no effects (Barnett
and Salomon 2012; Godfrey et al. 2009; Servaes and Tamayo 2013).
A possible reason for these mixed findings is that with a few notable exceptions, empirical studies of CSR tend to focus primarily on the demand side of the market (Du et al. 2011; Elfenbein et al.
2012; McDonnell and King 2013; Servaes and Tamayo 2013). The demand side explanation posits that
firms use CSR as a means to signal certain ethical or social values to consumers who have specific
preferences. In other words, CSR can be seen as a form of marketing or product differentiation. While
plausible, such efforts ignore the effect of CSR on the supply side of the market, or how one can utilize such signaling to impact the firm’s own employees. Under this view, investments in CSR provide
non-pecuniary incentives to (i) encourage a greater variety, or number, of potential employees to seek
employment at the firm, and once employed CSR can (ii) induce higher effort levels from employees,
which increases labor productivity. Indeed, there are some recent studies of the supply side effects of
CSR, but these tend to focus on employee retention or wage requirements (see Carnahan et al. 2015;
Burbano 2016).
In this study, we use a natural field experiment to identify important features on the supply side
of a structural econometric model. The central premise is that workers respond heterogeneously to
CSR, allowing us to estimate important behavioral parameters. Workers are willing to supply labor,
1 It should be noted that Friedman (1970) added some important qualifiers to the end of his statement about CSR. In
particular, the only social responsibility of the firm is the maximize its profits as long as it ”stays within the rules of the
game” and ”avoids deception or fraud”.
2
produce output, and pay attention to quality based, in part, on their social preferences. Under such a
model, investments in CSR have an effect on the intensive margin. Yet, heterogeneous preferences for
CSR also cause workers to differ in their propensity to select into different jobs. A worker who prefers
to work for a CSR firm will be more likely to apply for a job within this firm than with a non-CSR firm,
ceteris paribus, inducing an effect of CSR on the extensive margin. Our natural field experiment permits
an estimation of effects along both the intensive and extensive margins.
We begin the analysis by presenting the structural model. The model provides direction into the
exogenous variation necessary to quantify unobservable worker characteristics, such as motivation and
preferences. Our model draws upon the insights of Torgovitsky (2015) and D’Haultfoeuille and Février
(2015). The identification argument consists of deriving two mappings between types of agents, wages,
and hours worked. Once these mappings have been obtained, it is possible to retrieve the worker cost
function and the distribution of cost types in the population.
We extend this line of work in several directions. First, agents are characterized by a three-dimensional
vector of parameters representing leisure preference, productivity, and accuracy.
Second, since each agent chooses the utility-maximizing number of hours each workday, we introduce individual and day-specific shocks to the time budget constraint, and individual, day, and unitspecific shocks to productivity and quality. This allows for more flexibility and prevents the model
from being rejected if an agent is observed working a different number of hours during different days.
To provide the necessary experimental control and variation to estimate the model, we started our
own consultancy firm, HHL Solutions, LLC and used it as a vehicle for conducting the field experiment.
The stated mission of the firm is to provide data collection services for various clients. Every aspect
of the firm’s origination and organization is authentic. By starting our own firm, we are not only
able expose subjects to controlled variation in wages and information about the firm’s investments in
CSR, but we also achieve a high degree of control over the hiring process and the work environment.
Since subjects are recruited from an actual marketplace, we also improve upon the external validity by
avoiding sample selection problems than can occur when subjects are recruited from a very specific
population (Levitt and List 2007).
Our field experiment consists of randomized treatments administered in two stages: (1) a hiring
stage and (2) a real effort work task stage. Both stages generate data on worker behavior when faced
with different wage contracts and non-pecuniary incentives in the form of information about the firm’s
investments in CSR. The field experiment is tightly linked to the theoretical model to ensure the appropriate instruments and to produce sufficient variation in the data such that the parameters of interest
can be accurately estimated.
In the hiring stage of the experiment, we recruit subjects via job advertisements placed on online
marketplaces for jobs in 12 cities throughout the US. We vary the offered wage rate and how much
information is revealed about our firm’s involvement in CSR. Application rates in each treatment group
is the first outcome variable. This allows us to investigate how pecuniary incentives (the different
wage rates) and non-pecuniary incentives (information about CSR) affect labor market decisions on
the extensive margin. In particular, we are able to estimate whether non-pecuniary incentives in the
form of CSR information increase the probability that a given individual applies for the job. The hiring
3
stage serves one more important feature: to provide an instrument for the selection effect of CSR on
productivity, accuracy, and leisure preference.
In the second stage of the experiment, subjects who applied for the job proceed to conduct a real
effort work task through a custom-designed online task system. The task system provides us with
detailed information about the performance of each employee and allows us to construct measures of
labor supply, productivity (both in terms of output per hour and the amount of time it takes to produce
a given unit of output), and output quality.
Treatments in the work stage vary information about the client associated with each day’s work.
This allows us to estimate the intensive margin by turning on and off CSR incentives by denoting the
work as either ”CSR work” or not mentioning CSR. Importantly, both those who are recruited for a
CSR job and those who are recruited without information about CSR eventually work for CSR and
non-CSR clients once employment commences.
Several reduced form insights emerge. First, as expected, higher wages increase application rates;
an increase of wages from $11 per hour to $15 per hour increases application rates by about 33 percent.
Interestingly, advertising the firm as a CSR firm increases the application probability by roughly the
same amount as an increase of hourly wages from $11 to $15. In this way, both wages and CSR have
important effects on the extensive margin. Due to heterogeneity in preferences, individuals who are
more likely to respond to CSR may perform differently while working on the actual job. Since we
observe the behavior in the first stage for all subjects who participates in the second stage, we are able
to test whether this is the case. A sub-result under this first finding is that men appear to be more price
sensitive than women, and also more elastic to CSR appeals. The former insight consonant with the
literature on charitable fundraising (List 2011). The latter insight is, however, at odds with the literature
that suggests women are more responsive to social cues than men (Croson and Gneezy 2009).
Second, once employed, again both wages and CSR incentives matter. For example, workers paid
higher wages log more hours. And, at each wage level those working for CSR firms work more hours.
Importantly, the initial selection of workers affects ultimate productivity, especially for women. For
instance, those who we recruit to work for a CSR firm are much more sensitive to working for a CSR
firm than those who are recruited with no mention of CSR. In terms of output per hour, women respond
very strongly to CSR incentives whereas men have a statistically insignificant response. This result is
consonant with the insights from the literature (Croson and Gneezy 2009).
Third, there is no concomitant decrease in accuracy associated with CSR type women. Even though
they produce more per hour, their production accuracy does not suffer. Yet, for men, their product
quality actually increases when employed by a CSR firm. Together, these insights suggest that CSR
draws out higher output from women and higher quality from men.
In order to complement the reduced-form analysis, we use the data generated by the experiment
to estimate the structural model. This allows for an in-depth analysis of a wide array of questions. In
particular, how the impact on labor supply, productivity, and accuracy channels through selection and
treatment effects. This is an important feature of our methodology. Structural economic techniques use
statistical methods in combination with a theoretical model of worker behavior answer questions about
unobservable variables. Given this close connection between the model and the empirical method, the
4
experiment is designed with the model in mind in order to establish an ideal data-generating process
for estimating the components of the model. As a consequence, the model yields results that cannot
be derived directly from the data and is particularly useful for generalizing the results outside the
experimental setting.
We obtain a wealth of insights from the structural results. The baseline distribution of productivity
is comparable between male and female workers. Moreover, there does not seem to be any significant
treatment effects on productivity for either male or female workers. Hiring stage CSR incentives have
a significant impact on female worker productivity, however; for a given level of on-the-job experience,
women who select into the job under CSR incentives are about 40 percent more productive that those
who select in with neutral information. For accuracy, we find that women by baseline are more accurate
than men. But both selection and treatment effects induce men to increase their accuracy. This causes
a convergence in accuracy patterns between male and female workers.
This study is related to several strands of literature. To the best of our knowledge, the only previous
studies using field experiments to investigate the impact of CSR on labor market behavior are the ones
of Burbano (2015, 2016). Through field experiments conducted via mTurk and Elance, Burbano (2016)
estimates the reduced-form effects of CSR messages on labor output and salary requirements. The
results show that CSR adds value to the firm both through increased output and by lowering employee
salary requirements. A particularly interesting result from this study is that higher performing workers
appears to also be the ones with the greatest compensating differential. These workers are willing
to give a larger share of the wage to work for a firm that invests in CSR. Burbano (2015) uses a field
experiment to explore the effect of CSR on virtual workers, and find that CSR increases virtual worker’s
willingness to do extra, unrequired work.
This paper also contributes to the broader empirical literature on the effects of CSR on firm performance (Greening and Turban 2000). The question has primarily been addressed within the management literature. We contribute with an econometric methodology for investigating the supply side
effects of CSR.
We also contribute to the relatively new strand of literature within economics which uses experiments to generate data for structural models. Previous work has employed field experiments for structural analyses of, for example, charitable giving (DellaVigna et al. 2012), voting behavior (DellaVigna
et al. 2015), and disappointment aversion (Gill and Prowse 2012). We contribute to this literature by
providing an in-depth labor market application.
Finally, this study contributes to the literature on equalizing differences (Roback 1982; Rosen 1986).
Our structural analysis of selection effects constitutes a new way of estimating compensating differentials or equalizing differences across jobs that vary non-pecuniary incentives.
The methodology presented in this paper has large applicability in areas beyond the study of CSR.
One could readily use an almost identical method to estimate the effect of virtually any workplace
characteristic over which there are heterogeneous worker preferences. For example, flexible hours and
workplace competition.
The rest of the paper proceeds as follows. Section 2 outlines the theoretical model and discusses
its identification. The identification argument provides a number of insights regarding how the ex-
5
periment must be designed to achieve the proper variation in the data. Section 3 describes the field
experimental design explains how it enables us to identify the parameters of the model. Section 4
presents the reduced form results from the field experiment. This includes both the results at the extensive margin and those at the intensive margin. Section 5 presents the results from the structural
model. Finally, section 6 concludes the paper.
2
The Model
This section presents the model and its identification. The identification argument will reveal what
kind of variation we need to observe in the data in order to identify the parameters of interest. Since
theory guides the the experimental design, we will postpone the discussion of the natural field experiment until the next section of the paper.
We begin by outlining a simplified version of the model. The purpose is to explain how the results
of D’Haultfoeuille and Février (2015) and Torgovitsky (2015) apply to the problem at hand. In the
simplified model, agent heterogeneity is represented by a one-dimensional cost parameter and the only
observed output is labor supply. The simple model does not include non-pecuniary CSR incentives.
Instead, we confine the discussion to how exogenous variation in wages allows us to identify the cost
function and the distribution of the individual cost parameter.
We then extend the model to account for individual heterogeneity in three dimensions: leisure
preference, productivity, and accuracy. We also introduce day and individual-specific shocks to these
three parameters. This allows for more flexibility and prevents the model from being rejected when we
observe a given individual working a different number of hours during different days. Finally, when
agents are observed during several time periods, we may occasionally observe zero output. This is
simply due to the fact that some agents, given their time constraint, does not find it desirable to work
every single day. We therefore adapt the model to unbalanced panel situations with missing data.
2.1 Identification in The Baseline Model
Suppose that i = 1, 2, ..., I agents are randomly assigned to a wage contract with wage rate w i ∈ {w 1 , w 2 }
where w 2 > w 1 . The cost of working h hours is given by the function C (h; θi ) = θi c k (h) where θi ∼ Fk ,
k ∈ {1, 2} is an idiosyncratic cost-shifting parameter representing agent i ’s shadow value of time. The
distribution Fk is unknown and to be estimated. A greater θi means that the alternative cost of working
is higher and that we expect the agent to work fewer hours than agent with a lower θi if they are paid
the same hourly wage rate. We may think of θi as representing individual i ’s leisure preference. The
homogenous part of the cost function c k (h) is assumed to be increasing and strictly convex: c ′ (h) > 0,
c ′′ (h) > 0. Each agent is assumed to choose a number of hours h ∗ so as to maximize utility, defined by
income net of cost:
h ∗ = arg max {w i h − θi c k (h)}
h≥0
6
Since the cost function is increasing and convex, there exists a unique solution satisfying the FOC:
w i − θi c ′ (h ∗ ) = 0
Note that the expression
θi (h) =
wi
c ′ (h)
is decreasing and monotone since c ′ (h) > 0.
Our goal is to identify the distribution Fk and the cost function c(h) given the empirical distributions
of hours. To do so, we start by retrieving a function that maps between hours and cost types for each
wage contract. Consider the limit in which I → ∞ and we observe output generated by every possible θ .
Let θ (k) (h) denote the cost type that would induce h hours under wage k . The following two exclusion
restrictions are necessary for identification:
F1 = F2 = F
(1)
c 1 (h) = c 2 (h) = c(h)
(2)
Restriction (1) says that agent cost types are drawn from the same distribution and do not depend on
the wage. This assumption is motivated if there is no selection of agent types into wage contracts.
The second exclusion restriction (2) says the homogenous part of the cost function is the same across
wage contracts. This is motivated if task is uniform across contracts. Note that both assumptions are
easily motivated by proper experimental design; with controlled randomization of subjects to wage
treatments there will be no selection. Similarly, with a sufficient degree of control over the experimental
setting, it is possible to design a task that is uniform across treatments.
Suppose agent i is assigned to wage contract w 1 and work hi hours. This implies that θi satisfies
′
θi c (h i ) = w 1 . Now, suppose another agent j works the same number of hours h j = h i = h under wage
contract w 2 . This implies that θ j > θi since j requires a higher wage rate to work the same number
of hours as i . By using the FOC, we can derive a relationship between the cost parameters associated
with i and j . In particular:
θi =
w1
,
c ′ (h)
θj =
w2
c ′ (h)
θ2 (h) =
⇒
θj =
w2
θ1 (h)
w1
w2
θi
w1
⇒
(3)
Equation (3) provides a mapping between cost types who work the same number of hours across wage
contracts.
Suppose that we instead fix the cost type and ask how many hours an agent working under wage
contract w 1 would work if assigned to the contrafactual wage contract w 2 . To answer this question,
we use the fact that an agent’s observed number of hours is monotonic in θi . For example, the agent
with the median θi is going to work the median number of hours under either wage contract. Hence, if
agent i with cost type θi works hi hours under wage contract w 1 , and hi is the median in the empirical
7
distribution of hours among the agents working under w 1 , then we conclude that i would also work
the median number of hours in the empirical distribution for w 2 , had the agent been assigned to w 2 . In
particular, let G k denote the empirical distribution of hours for k ∈ {1, 2}. Since F (θ) = 1−G 1 (h) = 1−G 2 (h),
we have
[
]
h (2) (θ) = G 1−1 G 1 (h (1) (θ))
(4)
Once the mapping (3) between cost types for a fixed number of hours and the mapping (4) between
hours for a given cost type have been established, it is possible to obtain a sequence of points on the
graphs of θ (1) (h) and θ (2) (h). To do so, we begin by normalizing a given θ ∗ to use as a starting point.
For some (θ ∗ , h ∗ ) ∈ [θ, θ], let θ A (h ∗ ) = θ ∗ . This amounts to fixing total cost units since
C (h; θ) = θc(h) ⇔ C̃ (h; θ) =
ιθc(h)
,
ι
ι ∈ R+
By the argument of D’Haultfoeuille and Février (2015) and Torgovitsky (2015), given (θ ∗ , h ∗ ), (3)
and (4) identify θ (k) (h), k ∈ {1, 2} if there is a crossing point between θ (1) (h) and, θ (2) (h). Otherwise,
θ (k) (h), k ∈ {1, 2} is partially identified. Thus θ (1) (h) and θ (2) (h) can be obtained. By using θ (1) (h) and the
FOC, we can then retrieve c ′ (h) via G 1 .
2.2 Identification of The Extended Model
We now extend the model to include individual heterogeneity in leisure preference, productivity and
accuracy. Individual heterogeneity is thus described by a tuple (θL , θP , θ A ) where θL is the leisure preference parameter from the previous section, θP is a productivity parameter in individual i ’s production
function, and θ A is a parameter governing the accuracy with which individual i produces output. All
three parameters are exposed to treatment and selection effects due to CSR incentives. These effects,
along with the distribution of the parameters, are unobserved and to be estimated.
To fix ideas, consider a two-stage process in which a firm (1) hires workers and (2) lets them work
during a given period of time. The firm can incentivize its workers through the wage rate and through
a non-pecuniary incentive in the form of information about the firm’s investments in CSR. These incentives can be implemented both in a hiring stage and in a work stage.
In the hiring stage, each worker is offered a given wage rate and may or may not be exposed to
the non-pecuniary CSR incentive. We may think of this in terms of whether the firm openly advertises
its CSR efforts when recruiting workers. The primary question of interest is how information about
the firm’s investments in CSR attract workers who are inherently different in terms of productivity,
accuracy, and leisure preference than those in the general population. We will refer to this effect as the
selection effect of CSR.
Individuals who proceed to work for the firm are paid whatever hourly wage rate they were offered
in the hiring stage. Some of the individuals who begin to work for the firm were exposed to CSR
incentives in the hiring stage while others were not. Information about CSR may, however, also be
introduced any time during the work stage. This affects decisions at the intensive margin. One may
think of this in terms of the firm having employees working on different ”projects” where some projects
8
are associated with CSR and some are not. We will refer to the effect of work stage CSR incentives on
productivity, accuracy, and labor supply as the treatment effect of CSR. Of course, the treatment effect
may differ across subject who selected into the job under CSR incentives and those who selected in
under monetary incentives only.
When we observe agents working for several days, we may encounter a situation in which the
simple model described in the previous sub-section can be rejected. Since θ j i does not depend on t
for j ∈ {L, P, A}, each agent is predicted to work the same number of hours each day. The model would
hence be rejected if we were to observe an agent working a different number of hours during two
different periods. Moreover, the simple model does not account for unbalanced panel issues arising in
cases where agents only work during some of the possible days. In order to allow for more flexibility,
we let productivity, accuracy, and the agent’s time constraint be exposed to period-specific shocks. We
will now discuss the identification of the parameters associated with each agent’s leisure preference,
productivity, and accuracy type. In each case, we will separate the treatment effect from the selection
effect.
2.3 Leisure Preference
An agent’s leisure preference type affects how much labor the agent is willing to supply. This includes
the decision of whether to work at all during a given day. Agent i ’s period t leisure preference type is
assumed to be affected by CSR incentives in a multiplicative fashion:
F
X
X
F
θLi t = θLi βL00i (βFL0 ) X 0i D i βL11i t (βFL1 ) X 1i t D i u iLt
(5)
In this expression, θLi is the permanent, average leisure preference type for individual i . We assume
that θLi follows a distribution θLi ∼ FL (θLi , αL ) where αL , is a vector of parameters to be estimated. The
variables X 0i and X 1i are indicators for CSR incentives in the hiring and work task stage. D iF is a dummy
variable for whether individual i is female. The parameters βL0 and βL1 are selection and treatment
effects, and the parameters βFL0 and βFL1 are the female interaction selection and treatment effects. Note
that u iLt is a shock to the shadow value of time. The shock accounts for the fact that people may have
other things going on in their lives affecting how much time they are willing to spend working on the
task at hand. By introducing a shock to the time budget constraint, we let the model be flexible enough
so as to allow for cases in which a given agent works a different number of hours during different days.
Let w i denote the wage assigned to agent i and let hi t denote the number of hours worked by agent
i in period t . Each agent is assume to choose a number of hours h i t so as to maximize utility given by
income net of the utility cost function:
h ∗ = arg max{w i h −C i t (h; θLi t , αc )}
h≥0
The cost function is assumed to be separable into a idiosyncratic and a homogenous part:
C i t (h; θLi t , αc ) = θLi t c(h i t ; αc )
9
(6)
The idiosyncratic part is the leisure preference parameter already defined above. The homogenous
part is assumed to depend on a parameter vector αc . We assume that c(h; αc ) is stictly increasing and
strictly convex: c ′ (·; αc ) > 0, c ′′ (·; αc ) > 0. Note that this includes the special case of c ′ (0; αc ) > 0. For most
of what follows, we will suppress the dependency on the parameter vector αc whenever motivated
and simply write c(h).
For any given agent i and period t , one of two mutually exclusive cases can occur: we observe a
positive number of hours worked hi t > 0 or we observe zero hours worked hi t = 0. The FOC associated
with (6) will therefore satisfy
w i ≤ θLi t c ′ (h i t ),
i = 1, 2, ..., I ,
t = 1, 2, ..., T
The first case will help us identify the treatment effect while the second case will help us identify the
selection effect. We will treat each of these two cases separately. But before we proceed, we need to
establish identification of the cost function.
2.3.1
Identification of the Cost Function
In order to identify the cost function, we begin by limiting our focus to the subset of the sample for
which hi t > 0. Every individual-day observation is treated as if it were a unique person, in a setting
(k)
with no shocks, having fixed cost type θ̃m
, m = 1, 2, ..., M , were w k , k ∈ {1, 0} denotes the two different
wage rates and the number of day-individual combinations with hi t > 0 is M . Intuitively, θ̃ represents
all factors influencing a person’s leisure shadow value on a given day, including treatments and shocks,
but for the present stage all these terms are considered as one whole. The FOC implies
(k) k
(h m ) =
θ̃m
wk
k
′
c k (h m
; αc )
and the exclusion restrictions are
Assumption 1. The type distributions are invariant to wage treatment: FL(k) = FL , ∀k .
Assumption 2. The cost functions are invariant to wage treatment: c k = c, ∀k .
Again, these assumptions are motivated if the experimental design utilizes random assignment of
wages to subjects and the task is uniform across treatments. The equilibrium mapping between hours
worked across wage contracts for a given type can now be rewritten as
h 1 (˜(θ)) = G 1−1 [G 2 (h(θ̃))]
(7)
where G k is the distribution of hours worked under w k . Similarly, from the FOC, we obtain
θ̃1 (h) =
w1
θ̃(h)
w2
(8)
We now normalize the median of G 2 (h) so that θ̃2 (G −1 (0.5)) ≡ 1. It is possible to trace out (7) and (8) from
the normalized point and obtain θ̃1 (h) and θ̃2 (h). By assumptions 1 and 2, we can apply the results of
10
D’Haultfoeuille and Février (2015) and Torgovitsky (2015) and use the FOC for k = 2,
c ′ (h) =
w2
θ̃2 (h)
to identify the parameters of c(h; αc ).
2.3.2
Identification of Leisure Preference Treatment Effects
As the cost function has been retrieved, we can identify the treatment effect parameters. To do so, we
begin by assuming hi t > 0 so that agent i is actually observed working during period t . Then the FOC
holds with equality
w i = θiLt c ′ (h i t )
(9)
ln(w i ) − ln(c ′ (h i t )) − ln(θLi ) − X 0i ln(βL0 ) − X 0i D iF ln(βFL0 ) − X Li t ln(βLi ) − X Li t D iF ln(βFli ) = εLit
(10)
Taking logs of (9) gives us
and
(
Yi t ≡ ln
)
wi
= εLit + ln(θLi ) + X 0i ln(βL0 ) + X 0i D iF ln(βFL0 ) + X Li t ln(βLi ) + X Li t D iF ln(βFli )
c ′ (h i t )
Let
εCL i
=
εLN i
=
 ∑T εL 1(h >0 & X =1)
1i t
 ∑t =1T i t i t
if

0
otherwise
t =1 1(h i t >0 &
X 1i t =1)
∑T
t =1 1(h i t
 ∑T εL 1(h >0 & X =0)
1i t
 ∑t =1T i t i t
if

otherwise
t =1 1(h i t >0 & X 1i t =0)
0
∑T
t =1 1(h i t
> 0 & X 1i t = 1) > 0
> 0 & X 1i t = 0) > 0
Here we use the subscripts C and N to denote cases where there is a CSR incentive ( X 1i t = 1) and where
there is no such incentive ( X 1i t = 0). The selected distribution of εLit conditional on hi t is the same
regardless of treatment status since selection on the extensive margin only depends on X 0i for a given
θLi . It follows that
E [εCL i ] = E [εLN i ]
Now, let
Y Ci =
Y Ni =
(11)
 ∑T ( w i )
ln c ′ (h ) 1(h i t >0 & X 1i t =1)

 t =1 ∑
it
if

0
otherwise
T
t =1 1(h i t >0 &
X 1i t =1)
∑T
t =1 1(h i t
 ∑T ( wi )
ln c ′ (h ) 1(h i t >0 & X 1i t =0)

 t =1 ∑
it
if

0
otherwise
T
t =1 1(h i t >0 &
X 1i t =0)
∑T
t =1 1(h i t
> 0 & X 1i t = 1) > 0
> 0 & X 1i t = 0) > 0
and note that whenever
C≡
T
∑
t =1
1(h i t > 0 & X 1i t = 1) > 0
&
11
N≡
T
∑
t =1
1(h i t > 0 & X 1i t = 0) > 0
and ε̃Li ≡ εC i − εN i , we have
Y C i − Y N i = ln(βL1 ) + D iF ln(βFL1 ) + ε̃i
where
E [ε̃Li ] = 0
E [D iF ε̃Li ] = 0
F
For shorthand notation, let bL1 = ln(βL1 ) and bL1
= ln(βFL1 ). The baseline treatment effect βL1 and the
female-specific treatment effect βFL1 are identifiable and estimable via
]
[
β̂L1
β̂FL1
2.3.3
{
[
= exp arg min[bL1 ,b F
L1 ]
I (
∑
′
i =1
Y Ci − Y
F
F
N i − b Li − b L1 D i
)2 (
)
1 C > 0, N > 0
}]
(12)
Leisure Preference Selection Effects
We now turn to the selection effect on leisure preference. When hi t = 0, we have
w i < θLi t c ′ (0)
and
)
wi
− ln(θL ) − X 0i ln(βL0 ) − X 0i D iF ln(βFL0 ) < εLit
ln ′
c (0)
(
(13)
Let e i denote the LHS of (13) as it is the empirical analogue to εLit in the case of hi t = 0. We also define
ρ i as the probability of observing zero hours from individual i during a given day t :
ρ i = Pr[h i t = 0] = E [1(h i t = 0)]
(14)
Then we have
ρ i = Pr[e i < εLit ] = 1 − Φ(e i ) = Φ(−e i )
and
(
)
wi
Φ (ρ i ) = − ln ′
+ ln(θLi ) + X 0i ln(βL0 ) + X 0i D iF ln(βFL0 )
c (0)
−1
Now, let
(15)
∑I
ηlr j
=
F
i =1 ln(θLi ) × 1[(w i = j ) & D i = 1(l = F ) & X 0i = 1(r = C )]
,
∑I
F
i =1 1[(w i = j ) & D i = 1(l = F ) & X 0i = 1(r = C )]
∑I
ωlr j
=
i =1 Φ
−1
∑I
(ρ i ) × 1[(w i = j ) & D iF = 1(l = F ) & X 0i = 1(r = C )]
i =1 1[(w i
= j ) & D iF = 1(l = F ) & X 0i = 1(r = C )]
j ∈ {w 1 , w 2 },
r ∈ {C , N },
l ∈ {M , F }
From the random assignment of individuals to wage contracts, and from the exclusion restriction
E [θL |X 0 ] = E [θL ]
12
it follows that the selection effect parameters βL0 , βFL0 are identified via
[
[
]
]
M
M
E ωCMw 2 − ωM
N w 2 = E ηC w 2 − ηN w 2 + ln(βL0 ) = ln(βL0 )
(16)
[
]
[
]
M
M
E ωCMw 1 − ωM
=
E
η
−
η
+
ln(β
)
= ln(βL0 )
L0
N w1
C w1
N w1
(17)
[
]
[
]
E ωCF w 2 − ωFN w 2 = E ηCF w 2 − ηFN w 2 + ln(βL0 ) + ln(βFL0 ) = ln(βL0 ) + ln(βFL0 )
(18)
[
]
[
]
E ωCF w 1 − ωFN w 1 = E ηCF w 1 − ηFN w 1 + ln(βL0 ) + ln(βFL0 ) = ln(βL0 ) + ln(βFL0 )
(19)
Finally, we need to establish identification of the individual fixed effects, or in other words, the
sequence {θLi }iI =1 . But with c(·; α̂c ), β̂L1 , β̂FL1 , β̂L0 , and β̂FL0 known, {θLi }iI =1 . If hi t > 0, ∀i , t then {θLi }iI =1
could be retrieved directly from (10).
2.4 Productivity
Productivity determines the amount of time it takes to produce a unit of output. Let θP i t denote individual i ’s period t productivity type. Assume that θP i t is related to i ’s baseline productivity type θP i
in a multiplicative fashion:
F
X
F
X
θP i t = θP i βP 11i t (βFP 1 ) X 1i t D i βP 00i (βFP 0 ) X 0i D i
(20)
Also assume that θP i ∼ FP (θP i , αL ) where αL is a vector of parameters to be estimated. The parameters
βP 0 and βP 1 represent the treatment and selection effects of the CSR incentives on productivity. The two
additional parameters βFP 0 and βFP 1 account for gender differences in treatment and selection effects.
Let Q i t be the total number of units produced by i up until the end of period t so that Q i T is the total
number of units produced by i during the whole work period. The q th unit produced by i is simply
denoted by qi so that qi = 1, 2, ...Q i T .
2.4.1
Productivity Treatment Effects
The total time required for all output through unit q is given by production technology τt defined by
(
τt (q; θP i t ) =
q
θP i t
)δ
ũ qP
(21)
Taking logs of the production technology gives us
ln(τqi ) = δ ln(q i )−δ ln(θP i )−δ ln(βP 1 )X 1qi X 1qi −δ ln(βFP 1 )X 1qi D iF −δ ln(βP 0 )X 0i −δ ln(βFP 0 )X 0i D iF + ε̃Pqi (22)
Consider two subsequent units of output qi − 1 and qi and use (22) to form the log difference in
time required to produce each of them:
(
ln
τq i
τqi −1
)
(
)
qi
= δ ln
− δ ln(βP 1 )(X 1qi − X 1(qi −1) ) − δ ln(βP 1 )(X 1qi − X 1(qi −1) )D iF + εPqi
qi − 1
13
(23)
Since
E [εPqi ] = 0,
E [εPqi q i ] = 0,
E [εPqi X 1qi ] = 0,
E [εPqi D iF ] = 0
the parameters of (23) are identified. Now, define
P
ε̃C i
=
P
ε̃N i
Y
P
ji
=
=
 ∑Q
T
ε̃ 1(X =1)


 qi =1 qi 1qi
if


0
otherwise
∑Q i T
q i =1 1(X 1q i
=1)
 ∑Q
T
ε̃ 1(X =0)


 qi =1 qi 1qi
if
∑Q i T
qi


0
=1 1(X 1q i =1)
∑Q i T
q i =1 1(X 1q i
∑Q i T
q i =0 1(X 1q i
= 0) > 0
= 0) > 0
otherwise
 ∑Q
T

 qi =1 [ln(τqi )−δ ln(qi )]1[X 1qi =1( j =C )]

if


0
otherwise
∑Q T
q i =1 1[X 1q i
=1( j =C )]
∑Q T
q i =1 1[X 1q i
= 1( j = C )]
j = C,N
Since the error ε̃Pqi is independent of treatment status, it follows that whenever
QT
∑
q i =1
then
1(X 1qi = 1) > 0
P
&
QT
∑
q i =1
1(X 1qi = 0) > 0
P
Y Ni − Y Ci
δ
P
= ln(βP 1 ) + ln(βFP 1 )D iF + ε̃i
where
P
E [ε̃i ] = 0,
P
E [ε̃i D iF ] = 0
Hence, the treatment effects are identifiable and estimable via
[
β̂P 1
β̂FP 1
]
[
{
= exp arg min[bP 1 ,b F
P1]
I (
∑
′
i =1
P
Y Ci
−Y
P
F
F
N i − bP i − bP 1 D i
)2
(
1
Q
it
∑
q i =1
1(X 1qi = 1) > 0 &
Q
it
∑
q i =1
)}]
1(X 1qi = 0) > 0
(24)
The learning parameter δ is estimated by
)
(
))2
( (
τq i
qi
δ̂ = arg min
− δ ln
1(X 1qi = X i (qi −1) )
ln
δ i =1 q =2
τqi −1
qi − 1
i
Qi
I ∑
∑
2.4.2
Productivity Selection Effects
For the productivity selection effects, let
∑I
ωPl
r
=
i =1
∑Q i T
F
F
F
q i =0 [ln(τq i ) − δ ln(q i ) − δ ln(βP 1 ) − δ ln(βP i X 1q i D i )] × 1[D i = 1(l
∑I ∑Q i T
F
i =1 q i =1 1[D i = 1(l = F ) & X 0i = 1(r = C )]
14
= F ) & X 0i = 1(r = C )]
l = M,F
r = C,N
∑I
ηPl
r
=
F
i =1 ln(θP i ) × 1[D i = 1(l = F ) & X 0i = 1(r = C )]
∑I
F
i =1 1[D i = 1(l = F ) & X 0i = 1(r = C )]
l = M,F
r = C,N
From random assignment of individuals to wage contracts, and since
E [θP i |X 0i , D iF ] = E [θP i ]
it follows that
Pl
E [ωPNM − ωCP M ] = E [(ηPl
N − ηC )δ + ln(βP 0 )δ] = δ ln(βP 0 )
and
E [ωPNF − ωCP F ] = E [(ηPNF − ηCP F )δ + (ln(βP 0 ) + ln(βFP 0 ))δ] = δ(ln(βP 0 ) + ln(βFP 0 ))
Hence, βP 0 and βFP 0 are identified and estimable via
[
β̂P 0 = exp
[
β̂FP 0 = exp
ωPNM − ωCP M
]
δ
(ωPNF − ωCP F ) − (ωPNM − ωCP M )
]
δ
2.5 Accuracy
Accuracy measures whether or not unit q in period t was correctly entered by i . For our purposes,
higher accuracy directly translates to greater output quality.2 To fix ideas, we define a latent accuracy
∗
index a qi
as
∗
a qi
= Θ Aqi + εqi
(25)
where εqi follows a known symmetric distribution Φ(·). The accuracy parameter contains both permanent characteristics and treatment effects
Θ Aqi = Θ Ai + Tqi
where the fixed component is
Θ Aqi = θ Ai + β A0 X 0i + βFA0 X 0i D iF
2 In this subsection, we outline a version of the accuracy parameter identification without a learning effect. For a version
with a learning effect, see Appendix A. Introducing a learning effect makes the model considerably less computationally
tractable
15
and the time-varying treatment component is
Tqi = β A1 X 1qi + βFA1 X 1qi D iF .
As before, X 1qi represents the treatment that was in effect when i produced her q t h unit of output,
and may vary across different units produced on different days. We do not observe the latent accuracy
∗
index a qi
directly. Instead, we observe a binary a qi ∈ {0, 1} where
a qi

1
=
0
∗
if a qi
>0
otherwise
This implies
∗
Pr[a qi = 1|Θ Aqi ] = Pr[a qi
> 1|Θ Aqi ] = Pr[εqi > −Θ Aqi ] = 1 − Φ(Θ Aqi ) = Φ(Θ Aqi ).
As in the case of leisure preference and productivity, we will construct a differencing estimator.
In this case, however, we need to make adjustments to cope with the non-linearity in parameters introduced by the function Φ(·). We begin with a difference equation for the treatment effects β A1 and
βFA1 . We then derive the estimator for the selection effects β A0 and βFA0 , and then finally we back out
individual fixed effects θ Ai , given known values for treatments and selection.
2.5.1
Accuracy Treatment Effects
Note that for all of i ’s units produced under the same task framing, the treatment effect is held constant.
Let i ’s within-treatment accuracy score be defined as
Y
where Q N i ≡
∑Q i
A
ji
=
 ∑Q
i

 qi =1 1[A qi =1 & X 1qi =1( j =C )]
if

0
otherwise
q i =1 1(X 1q i
Q ji
= 0) and QC i ≡
∑Q i
q i =1 1(X 1q i
∑Q i
q i =1 1[X 1q i
= 1( j = C )] > 0
j = C,N,
= 1) are the totals of output units by i under task
framing j = C , N . Whenever both Q N i > 0 and QC i > 0 we have
(
)
Φ−1 Pr[A qi = 1|X 1qi = 1] = Θ Ai + β A1 + βFA1 D iF
and
(
)
Φ−1 Pr[A qi = 1|X 1qi = 0] = Θ Ai .
If we substitute the empirical analogs of the left-hand sides of the equations above and difference them
we get
[
( A )
( A )
]
E Φ−1 Y C i + BCC i − Φ−1 Y N i − BC N i = β A1 + β A1 + D iF
Where the expectation is taken across individuals i and the BC j i , j = C , N are known bias correction
terms (which we define below) to adjust for finite sample variability of the empirical accuracy scores
Y
A
ji .
This establishes that the accuracy treatment effects are identified and estimable via a simple least
16
squares routine:
[
β̂ A1
β̂FA1
2.5.2
]
{
= arg min
′
[β A1 ,βFA1 ]
I (
∑
i =1
Φ
−1
}
)2 [
[ A ]
[ A ]
]
F
F
Y C i + BCC i − Y N i − BC N i − β A1 − β A1 D i 1 (QC i > 0) & (Q N i > 0)
(26)
Accuracy Selection Effects
At this point we can consider the treatment effects as known objects and we now turn to selection
which will involve difference equations across individuals in the two hiring stage treatments. Let ωrAlj i
denote the empirical accuracy score for a person of gender l , hiring treatment group r , on days where
task framing treatment j is in effect. Specifically, this is defined as
ωrAlj i ≡
[
]
∑
Qi
F

1
(A
=1)
&
(D
=1[l
=F
])
&
(X
=1[r
=C
])
&
(X
=1[
j
=C
])
qi
0i
1q i
 qi =1
i

Q rl j i


0
where Q rl j i ≡
if Q rl j i > 0
l = F, M ; r = C , N ; j = C , N ,
otherwise
∑Q i
q i =1 1
]
[
(D iF = 1[l = F ]) & (X 0i = 1[r = C ]) & (X 1qi = 1[ j = C ]) is the total number of units
produced by person i , given his/her gender and hiring treatment, under task framing treatment j . By
similar logic of above, it follows that
[
]
(
)
E Φ−1 [ωlr j i ] + BC rl j i + 1[ j = C ] β A1 + βFA1 1[l = F ]
(
)
= θ Ai + 1[r = C ] β A0 + βFA0 1[l = F ] , l = F, M ; r = C , N ; j = C , N ,
(27)
where BC rl j i is a bias correction term as above. Since gender l and hiring treatment r are fixed, the
above expression gives us up to two equations for each individual i . Note also that for each individual,
the difference of these two equations leaves only the selection effect. However, before differencing we
aggregate across all individuals to get
]
]
[ [
(
)] [ l
Al
l
F
+
BC
−
1
[
j
=
C
]
β
+
β
1
[l
=
F
]
1
Q
>
0
ω
Φ
A1
r ji
i =1
A1
r ji
r ji
[
]
.
∑I
l
1
Q
>
0
i =1
r ji
∑I
Al
Ωr j ≡
Now, since E[θ A |X 0 ] = E[θ A ] by definition because the selection effect is just the mean difference in
permanent types, then by random assignment we have the following relationships:
[ AF
]
AF
E ΩCC − ΩNC = β A0 + βFA0 ,
[ AF
]
AF
E ΩC N − ΩN N = β A0 + βFA0 ,
[ AM
]
AM
E ΩCC − ΩNC = β A0 , and
]
[ AM
AM
E ΩC N − ΩN N = β A0 .
(28)
These equations establish that the selection effects are identified and estimable via a simple least squares
17
routine
{[
]2 [ AM
]2 }
AM
AM
AM
β̂ A0 = arg min ΩCC − ΩNC − β A0 + ΩC N − ΩN N − β A0
β A0
β̂FA0 = arg min
βFA0
2.5.3
{[(
) ( AM
)
]2 [( AF
) ( AM
)
]2 }
AF
AF
AM
AF
AM
ΩCC − ΩNC − ΩCC − ΩNC − βFA0 + ΩC N − ΩN N − ΩC N − ΩN N − βFA0
.
(29)
Individual Fixed Effects
Finally, with treatment and selection parameters known, it is easy to see why θ Ai is identified, since we
observe individual i across multiple units of production. Moreover, we can estimate fixed effects for
individual i with gender l and hiring treatment j using a simple weighted average
[
θ̂ Ai =
Q rl C i
Q rl C i +Q rl N i
+
( [
]
)
(
)
Φ ωrAlC i + BC rl C i − 1[r = C ] β̂ A0 + β̂FA0 1[l = F ] − β̂ A1 − β̂FA1 1[l = F ]
Q rl N i
Q rl C i +Q rl N i
( [
]
)]
(
)
Φ ωrAlN i + BC rl N i − 1[r = C ] β̂ A0 + β̂FA0 1[l = F ] − β̂ A1 − β̂FA1 1[l = F ] , i = 1, . . . , I .
(30)
2.5.4
Bias Correction
The source of bias from finite sampling above comes from the fact that
( [
])
[
(
)]
Φ−1 E ωrAlj i ̸= E Φ−1 ωrAlj i
due to the non-linearity in the quantile function Φ−1 . In particular, under a probit model (standard
normal shocks) there will be a bias toward the median for finite Q i since the normal CDF is convex
below the median and concave above it. However, one can use the known functional form of Φ and
the sampling distribution of ωrAlj i to correct for the bias.
Note that the empirical accuracy scores are sample means from Bernoulli random variables, meaning that the central limit theorem provides us with a sampling distribution theory. Specifically, if
∑N
N
{w n }n=1
w n /N
is an iid sample of Bernoulli observations with mean ρ , then the sample mean ρ̂ = W = n=1
p
is (approximately) normal with mean ρ and standard deviation ρ(1 − ρ)/ N . Using this information,
one can compute the expectation
[
]
E Φ−1 (ρ̂) =
∫
1
0
(
)
ρ(1 − ρ)
Φ (t ; 0, 1)φ t ; ρ̂, p
dt,
N
−1
where Φ(·; µ, σ) and φ(·; µ, σ) are the normal CDF and PDF, respectively.
2.6 Estimation
Our estimation strategy is to specify the functional form of the cost function c(·) and the distributions
F j as B-spline functions. B-splines are convenient tools for parametric curve fitting since they can be
18
constructed to fit any continuous curve to arbitrary precision (see Hickman et al. 2016). But before we
proceed, we need to clarify some technical hurdles concerning the estimation method and the distributions F j . In order to define a B-spline basis function, it is necessary to start with a fixed knot vector
of K j + 1 points spanning the relevant compact domain for F j , j ∈ {P, A, L}. But this will be problematic
since we do not know the upper and lower bounds of θ j . To circumvent this problem, we will instead
parametrize and estimate the quantile functions Q j associated with each F j . This is possible since we
know that the compact domain of Q j is [0, 1]. Our B-spline basis function for Q j is defined as follows:
Q j (r ; α j ) = F j−1 (r ; α j ) =
K∑
j +3
k=1
j
α j k Bk (r ),
j ∈ {L, P, A}
where B j k : [0, 1] → R is the B-spline basis function determined by the knot vector k j , j ∈ {L, P, A}. The
cost function is parametrized in a similar fashion:
c(h; αc ) =
K∑
c +3
k=1
αck Bkc (h)
where Bck : [0, max(hi t )] → R. But in order for these functions to be valid CDFs and cost functions,
we need to impose a number of shape restrictions. In particular, for each j ∈ {P, A, L}, we impose the
following condition:
Q ′j (r ; α j ) > 0,
∀r ∈ (0, 1)
c ′ (h; αc ) > 0,
∀h ∈ (0, 1)
c ′′ (h; αc ) > 0,
∀h ∈ (0, 1)
and for the cost function
c(0; αc ) = 0
In other words, both the quantile function and the cost function are monotone. By construction, a
B-spline function is monotone if and only if its parameters are ordered monotonically. This implies:
α j k ≤ α j (k+1) ,
∀k = 1, 2, ..., K j − 2, j ∈ {L, P, A}
αck ≤ αc(k+1) ,
∀k = 1, 2, ..., K c − 2
The above conditions also imply convexity of the cost function. To see this, note that convexity of the
B-spline function is imposed by
α2ck ≤ α2c(k+1) ,
∀k = 1, 2, ..., K c − 2
and the boundary condition is imposed by
αc1 = 0
19
Finally, note that any B-spline function
Q(x; α j ) =
K
∑
αk B(x)
k=1
can be expressed as
Q(x; α) = B(x)α
where the k ’th column of B(x) is Bk (x). Similarly, Q ′ (x; α) = B ′ (x)α where the k th column of B ′ (x) is
Bk′ (x). In practice, the above conditions are imposed as constraints on a non-linear solver.
The next step is to obtain the moment conditions necessary for pinning down the quantile function
parameters. First note that for an ordered sample of observations {x (1) ≤ x (2) ≤ ... ≤ x (I ) } from a distribution F X (x) with density f X (x) and quantile function Q X (r ), r ∈ (0, 1), the density and distribution of x (i )
is given by
(
)
[
]I −i
I −1
f X (i ) (x i ) = I
F X (x (i ) )i −1 1 − F X (x (i ) )
f X (x (i ) )
i −1
Now, suppose we have a random sample {x i }iI =1 generated from some arbitrary distribution F X (x).
We could generate a corresponding set of quantile ranks {r i }iI =1 through the transformation r i = F X (x i )
or x i = Q X (r i ), and note that R ∼ U (0, 1). If we order the observations, we get a set of order statistics
{x (1) ≤ x (2) ≤ ... ≤ x (I ) } and {r (1) ≤ r (2) ≤ ... ≤ r (I ) } with x (i ) = Q X (r (i ) ) by monotonicity of Q X (·). From above,
we know that
E [r (i ) ] =
and the density of R (i ) is
i
I +1
(
)
I − 1 i −1
f R(1) (t ) = I
t (1 − t )I −i
i −1
Therefore, we can conclude that
∫
E [X (i ) ] =
1
0
(
I −1
Q X (t ) f R(i ) (t )d t = I
i −1
)∫
1
0
Q X (t )t i −1 (1 − t )I −i d t ,
i = 1, 2, ..., I
This fact gives us a basis for moment conditions to pin down αP , α A , and αL . Namely, for an ordered
sample θ j (1) ≤ θ j (2) ≤ ... ≤ θ j (I ) we have
(
I −1
E [θ j (i )] = I
i −1
)∫
1
0
Q j (t , α j )t i −1 (1 − t )I −i d t ,
∀i = 1, 2, ..., I ,
j ∈ {P, A, L}
(31)
A final hurdle comes from the fact that we need a way of computing an empirical analog to the lefthand side of (31) that aggregates information from all observations to get a consistent estimator of
E [θ j (i )]. It turns out that it is possible to analytically compute the asymptotic value of
[ ] 1∑
S
à
j
j
θ̃
E θ(i ) =
S s=1 s(i )
20
j
j
j
j
j
j
given a sample of simulated order statistics {θ̃s(1) , θ̃s(2) , ..., θ̃s(I ) } generated by I draws from {θ1 , θ2 , ..., θI }.
j
j
This is accomplished by computing P km = Pr(θ̃s(m) = θ(k) ) for a given simulated sample S so that
I
∑
à
j
lim E [θ(m) ] =
P km θ j (k)
S→∞
k=1
The method for doing this is outlined in Appendix B.
3
Experimental Design
We design a natural field experiment (Harrison and List 2004) to generate the data necessary for estimating the model. The experiment uses randomized treatments administered in two stages: (1) an
hiring stage and (2) a real effort work stage. While the primary purpose of the hiring stage is to investigate the effect of CSR on the extensive margin, the focus of the work stage is the intensive margin.
The two stages are, however, linked in the sense that any subject who participates in the work stage
was also observed in the hiring stage. This approach allows us to investigate whether work stage
productivity depends on hiring stage CSR selection.
In the first stage, we hired subjects while varying wage and non-pecuniary treatment conditions.
The second stage real effort task had subjects working with a simple data entry task through an
online task system. The task system allows us to obtain detailed measures of output and labor input
during each day of the experiment. Working subjects are assigned to treatments on a daily basis, which
gives us both within and between subject variation.
The experiment was designed with several important conditions in mind. First, we need randomized assignment to treatments in both stages. This is necessary to identify the model as the model
implies that potential employees need to be offered different wage rates and given various degree of
information about CSR. Second, we need to observe the same test subjects in both stages. (More precisely, at least a subset of the subjects who participated in the application stage need to be observed
in the work task stage). This is crucial for separating treatment and selection effects. Third, we need
to achieve as natural an environment as possible for the subjects to review the job application and to
work. Fourth, the work task need to be uniform across treatments so as to allow for identification of
the cost function.
In order to satisfy all these conditions, we started an actual consultancy firm. The firm’s name
is HHL Solutions, LLC and specializes in data collection and data management services for various
clients. A website for the firm was also designed and published online. (See Appendix C for a screenshot from the firm website. The website did not play a direct part in the experiment but was set up in
cases potential subjects began to search for the firm on the web.)
Before we proceed, we discuss the necessity of actually running our own firm rather than partnering
with an existing firm or using a crowdsourcing market such as for example mTurk. Since we want to
investigate how information about wages and non-pecuniary factors affect decisions at the extensive
margin, we need to hire a relatively large number of individuals for a short term job. This imposes a
restriction on the set of firms with which we can partner. The empirical estimation procedure requires
21
detailed measurements of labor supply, output quantity, and output quality. This, in turn, necessitates
the work task to be relatively simple and uniform across workers, further limiting the possibility of
collaboration with an actual firm.
The reason for starting a firm rather than using internet crowdsourcing marketplace is twofold.
First, the subject pool in a crowdsourcing market is likely to exhibit certain characteristics that would
be undesirable for our purposes. For example, subjects recruited via mTurk have a tendency to pay
less information to experimental materials and have a different attitude towards money than subjects
recruited from other populations (see Chandler et al. 2014; Goodman et al. 2013; Paolacci and Chandler
2014). This would be problematic in our case since the attitude towards pecuniary and non-pecuniary
incentives is central to our research question, and our statistical power depends on subjects reading
through the materials. Second, the manner in which we assign subjects to treatments in the second
stage, and the degree of details in terms of output measures, require a large degree of flexibility in the
online task system. This would make it quite difficult to operationalize the experiment on a crowdsourcing market. We now turn to a more patient discussion of each experimental stage.
3.1 Hiring Stage
Employment advertisements were placed in 12 cities throughout the US on Craigslist.com. The particular cities were chosen based on size, geography, and the relative activity of the associated Craigslist
page. The ad described a short term employment opportunity in general terms and provided some
basic information about the employer. It was emphasized that the job consisted of a simple data entry
task that requires no specific previous experience and that the position is temporary, lasting only a
few weeks. The ad also stated that the wage would be somewhere in the range of $11-$15. Interested
individuals were encouraged to request more information by replying to the ad via email. Individuals
who responded to the ad were subsequently randomized into treatment groups. These subjects were
sent an additional email containing a wage offer and a more detailed description of the firm and the
job.
The first stage treatments were administered via this email. Importantly, all subjects, regardless of
treatment, were exposed to the same original job advertisement. This permits us to randomize subjects
to treatments in a controlled manner. In contrast, if we were to use more than one version of the ad,
there would be no way of controlling which ad each subject was exposed to and whether or not a given
subject was exposed to one ad.3 The separation between the initial response and the implementation
of the treatment also allowed us to balance the treatment cells based on gender. While we are not
allowed to explicitly ask an individual about their gender, we can predict the gender based on the
name. In particular, we used probabilities derived from the Social Security Administration database
name popularity. The timing of the first stage of the experiment is summarized in figure 1.
We used two different classes of treatments in the first stage: (1) two different wage levels, $11 and
3 This element of the design is similar to the one used by Flory et al. (2014) and Leibbrandt and List (2014) to study
gender differences in attitudes towards competition and wage negotiations. A similar ”two-step” randomization procedure
was used by Ashraf et al. (2010) to separate screening and sunk-cost effects in a field experiment on how to charge for health
care products in developing countries.
22
Figure 1: Hiring stage timing
$15, and (2) two different texts containing different amounts of information about the firm’s involvement in CSR. In terms of the notation from the model, the wage rate for subject i is represented by
w i ∈ {$11, $15}. The CSR information is represented by the dummy variable X 0t defined by
X 0i

1
=
0
ifsubject i informed about CSR in job description
otherwise
for i = 1, 2, ..., I . The wage and CSR treatments were crossed with each other, resulting in 4 treatment
cells. The treatment cells are summarized in table 1.
X 0i = 0
X 0i = 1
w i = $11
11 / Neutral
11 / CSR
w i = $15
15 / Neutral
15 / CSR
Table 1: Hiring stage treatments
The ”Neutral” and ”CSR” treatments warrant some further clarification. In the two treatment cells
with X 0i = 0, the task information email contained following information:
”The position is focused on data entry tasks for a number of our clients. We provide
services for a variety of different firms and organizations.”
The rest of the email specified the wage offer and outlined the task in more detail. In the two ”CSR”
treatment cells with X 0i = 1, there were more information about the firm’s CSR endeavors:
”The position is focused on data entry tasks for a number of our clients. We provide
services for a variety of different firms and organizations. Some of them work in the
nonprofit sector with various charitable causes. For example with projects aimed at
improving access to education for underprivileged children. We believe that these
organizations are making the world a better place and we want to help them doing
so. Due to the charitable nature of their activity, we only charge these clients at cost.”
In order to avoid deception, we made sure to actually have at least one of each type clients: one
for-profit firm and one non-profit organization. These particular clients were Uber Technologies, Inc.
and school districts, such as Chicago Heights. The remaining part of the email was identical to the one
of the Neutral treatment cells. The email ended by asking subjects to formally apply for the position—
23
if they are still interested after having been offered a wage rate and given more information about
the task—by submitting their resume. After viewing the additional information provided in either
treatment group, the subject makes the decision whether or not to apply for the job. This provides us
with our first outcome variable: the application decision. By computing the share of the initial pool
of subjects (the ones that responded to the original ad) that actually apply, we can estimate how the
wage rate (pecuniary incentive) and the information about CSR (non-pecuniary incentive) affect the
probability of applying. The application decision is an important outcome variable as it constitutes
the extensive margin effect of our treatments. But the two treatment groups used in the first stage
also give us two distinct groups of individuals entering the second stage of the experiment: those
who applied after having been exposed to either treatment. In terms of the model, we obtain a vector
X 0 = (X 01 , X 02 , , ..., X 0I ) where each element contains the X 0i dummy for each agent. Since there is a
selection effect on (θiP , θiA , θiL ), across-type differences will be important in the analysis of the treatment
effects in the work stage stage.
3.2 Work Stage
3.2.1
The Task
A subset of those who applied for the job in the hiring stage was hired to participate in the work stage.
Subjects were informed that they would be working for 10 days and that they were free to work any
number of hours during the project period. In terms of the model, one day corresponds to a ”period”
t so that t = 1, 2, ..., T in the experiment. Each subject worked from home through our custom-designed
online task system. They were each provided with a username and a password for logging into the
system at any time. The system allows us to track the number of hours worked by each subject each
day, the quantity of output produced, and the quality of output.
Upon logging into the system, subjects saw a dialog box explaining the task system itself and some
information about the client associated with the work for the particular day. After reading this information, subjects were taken to the ”main screen”. Here, subjects were faced with a list of tasks that
could be completed in any order. A task was initiated by clicking on the associated link taking the
subject to a data entry screen.
On the data entry screen, the subject is then presented with a snapshot from Google Streetview depicting a street in a major city in the US. The snapshot was sampled (without replacement) from a
large database of pictures collected prior to the experiment. Below the snapshot was a web form with
a number of text boxes and drop-down menus for entering data. The subject’s task was to complete the
form using information contained in the snapshot. This implied observing in the picture and correctly
entering this information for each variable listed.
In total, there were 6 meta-data variables about the picture itself and 12 variables related to the
street visible in the picture. Some of the variables require the subject to make an assessment of the
quality of the street or the buildings visible in the picture and assign a rating on a Likert scale. Other
variables required the subject to count the number of occurrences of certain objects in the picture (e.g.
potholes and broken windows). Once all the information is recorded, the subject submits the task by
24
clicking on a button, taking him or her back to the main screen. The completed tasks were then marked
as ”completed” in column in the middle of the main screen. Time was automatically tracked provided
the subject was logged into the system. The subject could view the amount of active working time
(recorded in minutes) on the main screen at any time. If the subject logs out they were once again
presented with the task description when logging back in.
The task was chosen so both our private clients and our non-profit clients could gain valuable
information. In this way, we can convey work done for each client without deceiving workers.
3.2.2
Work Stage Treatments
Since w i does not depend on t in the model, a given subject faced the same wage rate every day for the
duration of the experiment. The information about the nature of the ”client” associated with the task of
the particular day was, however, varied across treatments. Information about the client was presented
to subjects via the information text displayed when logging into the system and in information boxes
on the main screen. There were two different types of clients: ”for-profit clients” not associated with
CSR and ”non-profit” clients associated with CSR. It was emphasized that the latter type of client works
with improving access to education for underprivileged children and that our firm want to help these
clients making the world a better place. As a consequence, our firm is only charging these clients at
cost. This was our instrument for CSR in the work stage. (See Appendix C for sample screenshots from
the task and main screens.)
Subjects were randomized into a treatment at the beginning of each day. (At 12:00 am local time.)
This implies both within and between subject variation in the second stage. In terms of the model
variables, the work stage treatments are represented by the X 1i t dummy variable defined as
X 1i t

1,
=
0,
ifsubject i associated with CSR client during day t
otherwise
for i = 1, 2, ..., I and t = 1, 2, ..., 10.
3.2.3
Work Stage Outcome Variables
There are three principal outcome variables in the second stage: (1) number of hours worked, (2) quantity of output, and (3) quality of output. Quality of output is defined as the share of items correctly
submitted. The number of hours worked hi t is simply the number of hours subject i spent logged into
the system during day t . It is important to note that each subject is allowed to decide for himself how
many hours to work each day. Our time tracking system has a built-in mechanism so that subjects will
be automatically logged out (and paid time halted) if he or she stays inactive for longer than a specified
interval of time. This prevents subjects from ”gaming the system” by logging in and earning money
without actually working, while still allowing for some idle time. The automatic log-out mechanism
also imply that we do not need to distinguish time spent on the data entry screen from time spent on
the main screen.
25
We use two different measures of productivity. First, the familiar definition of productivity, output
per unit of time qi t /hi t will primarily be of interest in the reduced form analysis. Second, we obtain a
measure of τqi t by computing the accumulated amount of time it takes for a given subject to produce
up until the q th unit of output. This measure will be used to identify the productivity parameters of the
model. Finally, we obtain our measure of accuracy a qi t as whether or not unit q was correctly entered
by subject i for task q in time t . A unit is correctly entered if what the subject reported as being in the
Streetview snapshot is actually true. (This is verified ex post.) For Likert scale variables, we rely on the
”wisdom of the crowds” and the fact that each each individual picture was viewed by more than one
subject. If a subject’s response for a given item and picture was more than two points away from the
mean response for those individual also who worked on that same picture.
4
Reduced Form Results
This subsection presents the reduced-form results from the experiment. We will begin by discussing
the results from the hiring stage. The primary question of interest will be how different wage rates
and non-pecuniary incentives provided by information about CSR affect labor supply at the extensive
margin. In other words, how does information about investments in CSR affect the probability that
a given individual applies for the job? We then present the results from the work task stage of the
experiment. Here, we are interested in how wages and information about CSR affect labor supply
and output and whether the results differ across hiring-stage treatment groups. This will allow us to
address the question of whether, for example, subjects who were informed about CSR in the hiring
stage of the experiment behave differently in the work stage.
4.1 Hiring Stage: Extensive Margin
The recruitment stage of the experiment was conducted during the summer of 2016.4 The cities included in the experiment were Austin, Baltimore, Boston, Dallas, Houston, Indianapolis, Jacksonville,
Los Angeles, New York City, Philadelphia, Phoenix, and San Diego. A total of N=1103 job offers were
sent out via email to individuals who had expressed interest in the job. These individuals were randomly assigned to the four treatment cells: (1) $11 wage rate and no information about CSR (n=266),
(2) $11 wage rate and information about CSR (n=309), (3) $15 wage rate and no information about CSR
(n=277), and (4) $15 wage rate and information about CSR (n=251). Figure 2 shows the mean application rates, i.e. the share of subjects proceeding to apply for the job, in each treatment cell across the
whole sample.
Including information about CSR in the job description increases the application rate from 0.301 to
0.382 for subjects offered a $11 wage rate (Mann-Whitney U test: p=0.021). The corresponding change
for subjects with a $15 wage rate offer is from 0.408 to 0.494 (MWU: p=0.024). The effect of the wage
is only slightly larger. Upon changing the wage rate from $11 to $15, application rates among subjects
exposed to the neutral information increases from 0.301 to 0.408 (MWU: p=0.009). For subjects with
4 A smaller-scale pilot experiment was conducted in April 2016.
26
Share applying for the job
0.6
0.5
0.494
0.4
0.408
0.382
0.3
0.301
0.2
0.1
0.0
$11/Neutral
(n=266)
$11/CSR
(n=309)
$15/Neutral
(n=277)
$15/CSR
(n=251)
Figure 2: Application rates across treatments: all subjects. (Error bars show 5% confidence intervals.)
information about CSR, the corresponding increase is from 0.382 to 0.494 (MWU: p=0.008). There is no
statistically significant difference in application rates between subjects with a $11 wage combined with
information about CSR (0.382) and those with a $15 wage and information about CSR (0.408). We conclude that information about CSR does indeed increase the probability that a given individual applies
for the job. The effect size is about 8.5 percentage points. In comparison, the increase in application
rates due to change in the wage rate from $11 to $15 is roughly 11.5 percentage points. This implies
that revealing information about CSR induces roughly the same response in application rates as an
increase in the wage rate with about 30 percent. Robustness checks indicate no statistically significant
difference in these results across cities. It is worth noting that the observed application rates and wage
elasticities are roughly similar to those found in comparable studies (see for example Flory et al. 2014;
Leibbrandt and List 2014).
It is possible that men and women respond differently to wage and CSR incentives. To account for
this possibility, we split the sample into male and a female subsamples. Figure 3 shows the application
rates across treatments for women only. The first thing to note is that both the wage rate and the CSR
information has a smaller effect in the female subsample than in the full sample. Increasing the wage
rate from $11 to $15 increases the application rate by about 9 percentage points without information
about CSR (MWU: p = 0.057) and with about 10 percentage points in the CSR treatments (MWU: p
= 0.023). The CSR information increases application rates by rougly 5.5 percentage points in the $11
treatments (MWU: p=0.124) and by 7.8 percentage points in the $15 treatments (MWU: p=0.069).
Figure 4 shows the corresponding application rates for the male subsample. It is clear that men
show a greater response to both wage and CSR incentives. The effect of an increase in the wage from $11
to $15 range from 14 percentage points in the CSR treatments (MWU: p=0.032) to about 18 percentage
27
Share applying for the job
0.6
0.5
0.494
0.4
0.3
0.337
0.393
0.416
$11/CSR
$15/Neutral
0.2
0.1
0.0
$11/Neutral
(n=181)
(n=224)
(n=197)
$15/CSR
(n=168)
Figure 3: Application rates across treatments: women. (Error bars show 5% confidence intervals.)
Share applying for the job
0.6
0.5
0.500
0.4
0.3
0.2
0.392
0.357
0.214
0.1
0.0
$11/Neutral
(n=84)
$11/CSR
(n=84)
$15/Neutral
(n=79)
$15/CSR
(n=80)
Figure 4: Application rates across treatments: men. (Error bars show 5% confidence intervals.)
points in the neutral treatment (MWU: p=0.007). Men also show a relatively large response to the CSR
incentive; when advertising CSR, the application rate for men increases by 14.3 percentage points for
the $11 wage (MWU: p=0.020) and by 10.8 percentage points for the $15 wage (MWU: p=0.087).
In order to summarize the results at the extensive margin, we conclude that an increase in the wage
rate from $11 to $15 increases application rates with about 33 percent. Interestingly enough, the effect
28
of revealing information about CSR is almost as large: around 25-26 percent. When comparing the
application patterns between men and women, we find that on average, me are more price sensitive
than women. This is consistent with results in the literature on charitable fundraising. Men are more
likely to give to charity relative to women when the price of giving decreases (List 2011). We also find
that men are more elastic than women with regards to the CSR incentive. This finding runs contrary to
studies indicating that women are more responsive to social cues than men (Croson and Gneezy 2009).
One possible explanation for this result is that data entry jobs predominantly are performed by women
(our sample contains a much larger number of women than men) and the men who to select into the
pool of possible applicants differ in their attitudes towards social cues such as CSR relative to the male
population as a whole.
Note that so far, we have not said anything about selection effects. Our results indicate that in
certain cases, firms can increase the pool of applicants by roughly the same magnitude as the increase
associated with a 30 percent wage increase by simply revealing information about CSR. Apart from the
obvious fact that a firm might not gain from undertaking costly CSR investments just to expand the
pool of applicants, there are more subtle ways in which non-pecuniary incentives like CSR information
can backfire. In particular, the wage and CSR incentives may not attract the same types of employees;
those attracted by a higher wage may differ from those attracted by CSR along important dimensions
such as productivity, accuracy, and leisure preference. The results from the work task stage of the
5
5
4
4
3
3
Non-CSR tasks
CSR tasks
.
Hours
experiment will shed light on whether this the case.
2
2
1
1
0
$11
$15
0
(a) Mean hours worked by wage
$11
$15
(b) Mean hours by wage and task type
Figure 5: Mean hours worked by wage rate and task type in pooled data. (Pooling men and women of
both CSR and non-CSR types.) Error bars show 5% confidence intervals.
4.2 Work Stage: Intensive Margin
There are two different types of subjects entering the work task stage of the experiment: (1) ”CSR types”
who selected into the job after having been exposed to CSR information in the hiring stage and (2) ”nonCSR types” who selected into the job without information about CSR. Subjects were randomly sampled
29
from each of the four hiring stage treatment cells so as to achieve a balanced sample of types and
genders in the work stage. A total of 170 subjects participated in the work task stage of the experiment.
Each subject was observed during a period of 10 days, which implies 1700 individual-day observations.
Some subjects, however, did not supply a positive amount of hours every day; the mean number of days
worked during the experiment across all subjects was 5.31. The mean daily number of hours worked,
conditional on logging into the system, was 2.779 (SD = 3.086). On average, subjects produced 77.94
(SD = 87.16) units of output each day. Table 2 presents summary statistics for the work stage of the
experiment.
Mean
Median
SD
Min
Max
Days worked
5.311
5
3.158
1
10
Hours worked per day
1.481
0.095
2.685
0
16.279
Hours worked per day |hi t > 0
2.788
1.608
3.151
0.001
16.279
Daily earnings ($)
37.956
21.990
43.896
0.012
179.069
Output
77.744
46.000
87.085
1
613
0.797
0.856
0.191
0.033
1
Accuracy
Table 2: Summary statistics: work stage. Total number of subjects: N = 170.
Figure 5 shows the effect of wage and CSR incentives in the full sample. (Here we pool men and
women of both CSR and non-CSR types.) As expected, subjects who are paid a higher wage also supply
more labor; when increasing the wage rate from $11 to $15, the mean number of hours worked increases
from 2.475 to 2.962 (MWU: p=0.006). This implies an increase of roughly 30 minutes per day or a 20
percent increase in labor supply.
Panel (b) of figure 5 splits each wage group into CSR and non-CSR tasks. Subjects appear to increase
labor supply when working with CSR tasks. The effect of CSR is also similar in magnitude across wage
levels; subjects work on average about 15-20 minutes longer with CSR tasks than with non-CSR tasks.
The increase is, however, not found to be statistically significant in the pooled sample (MWU: p=0.130
for $11 and MWU: p=0.165 for $15).
We now proceed to examine the effect of wage and CSR treatments separately for CSR and non-CSR
type men and women. The patterns observed in figure 6 suggest that the effect of the CSR tasks vary
across these four sub-groups. In particular, while there does not appear to be any significant effect for
men, CSR type women exhibit quite a substantial increase in labor supply when working with CSR
tasks. For the female subjects with a $11 wage, the mean number of hours worked increase from 1.86
for non-CSR tasks to 2.69 in CSR tasks (MWU: p=0.021), i.e. an increase by roughly 50 minutes per day.
The corresponding increase for the female CSR type subjects with a $15 wage rage is even greater; from
2.266 to 3.326 hours per day (MWU: p=0.019). This implies an increase by around 63 minutes per day.
Given the relatively small sample of men, we are not able to draw any definitive conclusions about the
impact of CSR tasks on male labor supply.
We now turn to the effect on productivity. Figure 7 shows output per hour for the same four sub-
30
6
5
5
4
4
3
3
.
Hours
6
2
2
1
1
0
$11
0
$15
6
6
5
5
4
4
3
3
2
2
1
1
0
$11
$15
$11
$15
(b) CSR in hiring stage, women
.
Hours
(a) Neutral in hiring stage, women
Non-CSR tasks
CSR tasks
0
(c) Neutral in hiring stage, men
Non-CSR tasks
CSR tasks
$11
$15
(d) CSR in hiring stage, men
Figure 6: Labor supply by sub-group. Error bars show 5% confidence intervals.
groups as in figure 6 : CSR and non-CSR type men and women. There does not appear to be much of
an effect on productivity from either the wage or the CSR incentives for the non-CSR types. (Note that
we do not necessarily expect the wage rate to increase productivity since subjects are only paid by the
hour.) For CSR-types, however, two striking patterns emerge. CSR-type women who work with CSR
tasks appear to increase productivity from about 32.6 to 40.07 units per hour for the $11 wage group
(MWU: p=0.077) and from about 39.7 to 43.4 units per hour in the $15 wage group (MWU: p=0.130).
This implies that CSR type women do not only supply more labor when working with CSR tasks, they
also produce more output per hour. For CSR-type men, however, we see a decrease in productivity
with about 5-6 units per hour. This effect is, however, not statistically significant.
Is the fact that CSR type women increase productivity when working with CSR tasks an indication
of a shift in focus from quality to quantity? To answer this question, we examine figure 8 which shows
our measure of quality, the share of correctly entered items, across treatments. It is clear that those
31
60
50
50
40
40
30
30
.
Output per hour
60
20
20
10
10
0
$11
0
$15
60
60
50
50
40
40
30
30
20
20
10
10
0
$11
$11
$15
(b) CSR in hiring stage, women
.
Output per hour
(a) Neutral in hiring stage, women
Non-CSR tasks
CSR tasks
0
$15
(c) Neutral in hiring stage, men
Non-CSR tasks
CSR tasks
$11
$15
(d) CSR in hiring stage, men
Figure 7: Productivity by sub-group. Error bars show 5% confidence intervals.
women who increased their productivity when working with CSR tasks did not do so at the expense
of quality. On the contrary, quality actually goes up when CSR-type women work with CSR tasks.
CSR-type men, however, appear to be increasing their accuracy even more than the women. This is the
case for both the $11 and $15 wage groups. Also note that non-CSR type men in the $15 wage group
increase their accuracy when working with CSR tasks. This suggests that men to a greater degree than
women respond to CSR incentives in terms of increased quality. These patterns are robust to several
definitions of quality.
In order to summarize the results on the intensive margin, we first note that workers who are paid
a higher wage work longer hours. Moreover, for a given wage level, subjects work longer hours if
they are working with a CSR task. This pattern is primarily driven by the effect of CSR incentives on
women who selected into the job with information about CSR. In terms of productivity, women also
respond strongly to CSR incentives. The finding that women appear to be more responsive to social
cues such as CSR is consonant with insights from the literature (Croson and Gneezy 2009). Women
32
1.0
0.8
0.8
0.6
0.6
.
Share correct
1.0
0.4
0.4
0.2
0.2
0.0
$11
0.0
$15
(b) CSR in hiring stage, women
1.0
1.0
0.8
0.8
0.6
0.6
.
Share correct
(a) Neutral in hiring stage, women
$11
Non-CSR tasks
CSR tasks
$15
0.4
0.4
0.2
0.2
0.0
$11
$15
0.0
(c) Neutral in hiring stage, men
$11
Non-CSR tasks
CSR tasks
$15
(d) CSR in hiring stage, men
Figure 8: Accuracy by sub-group. Error bars show 5% confidence intervals.
are also more productive when working with CSR tasks. This increase does not appear to be at the
expense of of quality. Men who work with CSR tasks appear to be producing higher quality of output
but at a slower rate. Overall, these patterns suggests that CSR incentives draw out higher output
and productivity from women and higher quality for men. The stark contrast in performance and
response to treatments between CSR and non-CSR types suggests that selection plays an important
part in driving the results.
5
Structural Estimation Results
5.1 Leisure Preference
Figure 9 shows the estimated cost and the marginal cost functions. Due to our normalization procedure,
the functions presented in the figure correspond to the median worker in the $15 wage group. In order
33
to induce the median worker to supply 8 hours of labor, we would have to compensate the worker
with a total amount of $500. This corresponds to an increase in the wage rate from $15 to $60 per hour.
Also note that to induce between 10 to 16 hours of work, we would have to incentivize the workers
quite substantial amounts. Again, it is important to note that this is true only for the median worker
in our sample. The median worker in the $15 wage treatment supplied an average of 1.7 hours of
labor each day. Those subjects who actually worked an average of 10 hours or more per day did so
at a much lower compensation that what the median worker would require. Since we are estimating
the homogenous part of the cost function c(·), the actual cost function for an individual with a higher
(lower) cost of leisure would be scaled up (down) by a multiplicative constant, namely the individual
θLi .
Table 3 shows the estimated treatment effects for men and women. Recall that a lower θLi t implies
a smaller alternative cost of time and, ceteris paribus, a greater labor supply. The treatment effect for
males is βL1 = 0.9859. This estimate is, however, not significantly different from one. (Recall that the
multiplicative form of θL implies that an estimate of 1 indicate no effect.) Hence, we do not find any
evidence that men, conditional on selection, respond positively to work stage CSR incentives in terms
of labor supply. The estimated treatment effect parameter for women is βFL1 . The total treatment effect
for female workers is 0.78 × 0.9859 ≈ 0.769. This implies a decrease in leisure preference parameter
with roughly 25% for women who are incentivized by CSR in the work stage.
600
3000
500
2500
400
2000
c(h)
1500
300
1000
200
500
100
0
MC(h)
0
2
4
6
8
10
12
14
0
16
(a) Cost function
0
2
4
6
8
10
12
14
16
(b) Marginal cost function
Figure 9: Estimated cost and marginal cost functions.
5.2 Productivity
Table 4 shows the estimated treatment and selection effects for the productivity parameter. A higher
value of θP implies a more productive worker since a given unit of output takes less time to produce.
First note that the overall baseline productivity type is 1.514 (SD = 0.1.001). There are no large dif-
34
Parameter
Point estimate
Standard error
βL1
0.9859
0.0245
βFL1
0.7800
0.0441
Table 3: Estimates of leisure preference treatment effects for men and women.
Type of effect on θP
Parameter
Learning
Point estimate
Standard error
p-value
δ
0.8556
0.0136
< 0.001
Treatment, males
βP 1
0.9933
0.0271
0.805
Selection, males
βP 0
1.2633
0.3243
0.417
Treatment, females
βFP 1
βFP 0
βFP 0 × βP 1
βFP 0 × βP 0
1.0336
0.0337
0.319
1.1105
0.3485
0.751
1.0267
0.0189
0.158
1.4030
0.1877
0.033
Mean θP (SD)
Baseline
Neutral recruited
CSR recruited
Total sample:
1.514 (1.001)
0.996 (0.695)
2.032 (0.994)
Male sample:
1.448 (0.961)
1.056 (0.756)
1.806 (1.003)
Female sample:
1.537 (1.017)
0.976 (0.679)
2.116 (0.985)
Selection, females
Total treatment, females
Total selection, females
Table 4: Productivity parameter θP estimates. Reporting bootstrapped standard errors.
ferences in baseline productivity types across genders.The male baseline productivity type is slightly
lower than the overall baseline, 1.448 (SD = 0.961), while the female baseline productivity is slightly
higher 1.537 (SD = 1.017). The estimated distributions of the productivity parameter for male and
female workers is shown in figure 10. It is clear that the distribution of θP is very similar between
male and female workers in the sample. There is a small tendency for female workers to have slightly
greater productivity parameters in the upper part of the distribution. This difference is, however, not
sufficiently large to make us conclude that there is any baseline productivity difference between men
and women. This means that from the outset, before CSR incentives come into play, male and female
workers start out with roughly the same productivity distributions.
When we introduce CSR, a different pattern emerges. First, consider the treatment effect. Table
4 indicate that there is no significant response in productivity from work stage CSR incentives for
men. The estimated treatment effect for male workers is 0.9933, not significantly different from one.
The full treatment effect for women is only slightly larger: 1.0267. But the effect is not statistically
significant. Since there are no significant treatment effects, we conclude that that work stage CSR
incentives do not work particularly well in inducing higher productivity from male or female workers.
This finding is supported by figures 11 and 12, showing the distribution of θP among treated and
untreated workers. The distributions essentially coincide, implying that work stage CSR incentives
35
1.0
Estimated CDFs
0.8
0.6
0.4
Full Sample
Males
Females
0.2
1
2
3
4
5
θP
Figure 10: Distributions of the productivity parameter
1.0
Estimated CDFs
0.8
0.6
0.4
0.2
0.0
Neutral treated sample
CSR treated sample
0
1
2
3
4
5
θP
Figure 11: Distributions of the productivity parameter by work stage incentive (both genders).
have no effect on productivity regardless of baseline productivity level.
We now turn to the selection effect. Figure 13 indicate that workers who select into the job under
CSR incentives are significantly more productive than those who select in without incentives. From table 4, we see that the male productivity type among the neutral recruited workers is 1.056 while the the
one for CSR recruited workers is 1.806. The corresponding productivity parameters for female workers
36
Estimated CDFs
1.0
0.8
0.6
0.4
0.0
1.0
Estimated CDFs
Neutral treated sample: Females
CSR treated sample: Females
0.2
0
1
2
3
4
5
0.8
0.6
0.4
Neutral treated sample: Males
CSR treated sample: Males
0.2
0.0
0
1
2
3
4
5
θP
Figure 12: Distributions of the productivity parameter by work stage incentive and gender
are 0.976 and 2.116. These are quite substantial differences. The estimate for the male selection effect is
positive but not statistically significant. (This is most likely due to the relatively small sample of males
in our experiment.) In other words, it does not appear as if male worker productivity is affected by CSR
incentives in either the hiring or work stage. The selection effect for women is, however, significant.
The total selection effect for women is 1.403, meaning that women who select into the job under CSR
incentives are about 40% more productive. These patterns are verified by figure 14. In particular, note
the substantial difference in the distribution of θP for the different types of women in the sample. It is
clear that the hiring stage CSR incentive bring in women who are a great deal more productive than
those who select in without CSR incentives.
Finally, while not of primary interest, we still note that we find evidence of learning by doing. As
δ̂ < 1, workers become more productivity over time rather than showing signs of exhaustion.
5.3 Accuracy
Table 5 shows the estimated treatment and selection effects for accuracy. A greater θ A means higher
accuracy, ceteris paribus. Note that in contrast to productivity, the accuracy parameter may assume a
negative value. The baseline accuracy parameter in the whole sample is 0.725 (SD = 0.797) while the
baseline for men is 0.514 (SD = 0.519) and the baseline for women is 0.799 (SD = 0.863). This means that
by baseline, women are considerably more accurate than men in our sample. This patten is reflected
in figure 15 showing the estimated distributions of θ A in the whole sample and for male and female
workers separately. It is clear that women are more accurate than men over most of the distribution
of θ A . Most of the difference between men and women seem to come from some female workers with
particularly high θ A .
37
1.0
Estimated CDFs
0.8
0.6
0.4
0.2
0.0
Neutral selected sample
CSR selected sample
0
1
2
3
4
5
θP
Figure 13: Distributions of the productivity parameter by hiring stage incentive (both genders).
Estimated CDFs
1.0
0.8
0.6
0.4
0.2
0.0
1.0
Estimated CDFs
Neutral selected sample: Females
CSR selected sample: Females
0
1
2
3
4
5
0.8
0.6
0.4
Neutral selected sample: Males
CSR selected sample: Males
0.2
0.0
0
1
2
3
4
5
θP
Figure 14: Distributions of the productivity parameter by hiring stage incentive and gender
We now turn to treatment effects on accuracy. Figure 16 show the distribution of θ A for treated and
untreated workers. It is clear that the work stage CSR incentive shifts the distribution towards higher
accuracy in the whole sample. From table 5 we see that the treatment effect for males is 0.6654 while the
total treatment effect for women is 0.3265. This means that work stage CSR incentives induce higher
accuracy for both men and women. The effect on male workers is particularly large. (This is partly
38
driven by the fact that men by baseline have lower accuracy than women.) These patterns are verified
by figure 17. Work stage CSR incentives appear to shift the whole distribution of accuracy for men but
primarily work on the middle of the distribution for women. (Again, since there are female workers
with very high baseline accuracy, we do not expect a particularly large treatment effect on upper part
of the distribution for women.)
Type of effect on θ A
Parameter
Treatment, males
Point estimate
Standard error
p-value
β A1
0.6654
0.0871
0.00017
Selection, males
β A0
0.2708
0.1394
< 0.00001
Treatment, females
βFA1
−0.3389
0.1289
< 0.00001
Selection, females
βFA0
βFA0 + β A1
βFA0 + β A0
0.0871
0.2060
0.00002
0.3265
0.0945
< 0.00001
0.3579
0.1536
< 0.00001
Mean θ A (SD)
Baseline
Neutral recruited
CSR recruited
Total sample:
0.725 (0.797)
0.560 (0.852)
0.890 (0.704)
Male sample:
0.514 (0.519)
0.382 (0.395)
0.634 (0.593)
Female sample:
0.799 (0.863)
0.618 (0.950)
0.985 (0.723)
Total treatment, females
Total selection, females
Table 5: Accuracy parameter θ A estimates. Reporting bootstrapped standard errors.
Just like in the case of productivity, we can examine the selection effect by observing the distribution
of θ A for those in the sample who selected into the job with and without CSR incentives. Figure 18
indicates that selection is important primarily in the lower part of the distribution for the full sample.
If we split these distributions by gender, we see that both male and female workers are affected by
hiring stage CSR incentives. The overall pattern appear very similar to the treatment effect. Table 5
indicate a selection effect of 0.2708 on male workers and 0.3579 on female workers. While the treatment
effect was greater for men than for women, the selection effect appear to be greater for women. This
suggests that hiring stage CSR incentives works well in bringing in women with high accuracy.
In order to summarize the structural analysis, we conclude that our results are largely consistent
with the reduced form findings presented in the previous section. The baseline productivity levels are
similar across male and female workers. CSR incentives primarily work in the hiring stage by inducing
more productive workers to select into the job. This is true for both male and female workers, but in
particular for female workers. Work stage CSR incentives do not seem to have any significant effect.
For accuracy, we observe significant effects in both the hiring and the works stage. Female workers
have higher accuracy by baseline. While the treatment effect induces higher accuracy from female
workers, the baseline level is already so high that we only see limited improvement. Treatment effect on
male workers is, however, particularly large, bringing them closer to the females in terms of accuracy.
This means that the work stage CSR incentive causes a convergence in performance across men and
women both in terms of accuracy. Selection also appear to be important for accuracy. Both male and
39
1.0
Estimated CDFs
0.8
0.6
0.4
Full Sample
Males
Females
0.2
−4
−3
−2
−1
0
1
2
3
4
5
θA
Figure 15: Distributions of the accuracy parameter.
1.0
Estimated CDFs
0.8
0.6
0.4
0.2
0.0
−3
Neutral treated sample
CSR treated sample
−2
−1
0
1
2
3
4
5
θA
Figure 16: Distributions of the accuracy parameter by work stage incentive (both genders).
female workers respond well to hiring stage CSR incentives. The effect is particularly large for females.
The estimated treatment effects on leisure preference appear to indicate that women increase their
labor supply when incentivized by CSR. There is room for much further analysis regarding leisure
preference.
40
Estimated CDFs
1.0
0.8
0.6
0.4
0.0
−3
1.0
Estimated CDFs
Neutral treated sample: Females
CSR treated sample: Females
0.2
−2
−1
0
1
2
3
4
5
6
0.8
0.6
0.4
Neutral treated sample: Males
CSR treated sample: Males
0.2
0.0
−3
−2
−1
0
1
2
3
4
5
6
θA
Figure 17: Distributions of the accuracy parameter by work stage incentive and gender.
1.0
Estimated CDFs
0.8
0.6
0.4
0.2
0.0
−3
Neutral selected sample
CSR selected sample
−2
−1
0
1
2
3
4
5
θA
Figure 18: Distributions of the accuracy parameter by hiring stage incentive (both genders).
6
Concluding Remarks
The ”business of business is business” has been a mantra associated with Milton Friedman for nearly
half a century. Having articulated his view in a 1970 New York Times article, that the only responsibility
41
Estimated CDFs
1.0
0.8
0.6
0.4
0.0
−3
1.0
Estimated CDFs
Neutral selected sample: Females
CSR selected sample: Females
0.2
−2
−1
0
1
2
3
4
5
6
0.8
0.6
0.4
Neutral selected sample: Males
CSR selected sample: Males
0.2
0.0
−3
−2
−1
0
1
2
3
4
5
θA
Figure 19: Distributions of the accuracy parameter by hiring stage incentive and gender.
of business was to increase its profits, he described contrarians as ”puppets of the intellectual forces that
have been undermining the basis of a free society.” Today, over 90% of major businesses have specific
programs dedicated to corporate social responsibility, and many CEOs trumpet their organization’s
commitment to social engagement.
Critics view such facts as strong evidence that Friedman was wrong. But was he? A narrow interpretation of Friedman certainly has proven incorrect, as the business world has evolved in a much more
social manner than Friedman likely anticipated. This does not, however, mean that CSR is necessarily
at odds with maximizing profits.
This study takes a systematic approach to exploring the supply side effects of CSR. A great deal
of previous empirical scholarly exercises have focused on the demand side with mixed results. This
has led to hotly-contested debates in corporate law, business ethics, economics, and the social sciences
more broadly. Our study combines theory with a natural field experiment to explore how a firm can use
CSR to impact its labor supply and subsequent effort levels and productivities of workers. In this way,
our study is unique in that we begin with a well specified model of CSR and estimate its behavioral
parameters using a field experiment.
Our results are generally positive for CSR, in that practical, market-based approaches can be used
to affect the supply side of business. For instance, advertising the firm as a CSR firm increases the
application probability by roughly the same amount as an increase of hourly wages from $11 to $15.
In addition, once employed, CSR incentives matter at each wage level in that those working for CSR
firms work more hours. In terms of output per hour, CSR incentives work to improve productivity of
all workers, but especially women. Importantly, there is no concomitant decrease in product quality.
42
6
These gender differences in response to CSR may seem at odds with findings from some recent studies
on the effect of CSR. In particular Bode and Singh (2014), who find no differences in CSR response
across men and women in a consultancy firm. It is, however, possible that the type of men who select
into a female-dominated market such as data entry differ from the general population just like those
women who select into a male-dominated field such as management consulting.
We conclude that CSR should not be viewed as a necessary distraction from a profit motive, but
rather as an important part of profit maximization similar to other non-pecuniary incentives. Our
results highlight that the real challenge for CEOs is not whether CSR should be used, but what are the
best social innovations to put in place. We view our work as only a start on how to leverage economic
theory and field experiments to lend insights into CSR. The next steps should refine our approach
and utilize different forms and incentives of CSR to provide a more complete understanding of the
underlying motivations and behaviors on both the demand and supply sides of the market.
References
Ashraf, N., Berry, J., and Shapiro, J. M. (2010). Can higher prices stimulate product use? evidence from
a field experiment in zambia. American Economic Review, 100:2383–2413.
Barnett, M. L. and Salomon, R. M. (2012). Does it pay to be really good? addressing the shape of the
relationship between social and financial performance. Strategic Management Journal, 33(11):1304–
1320.
Bode, C. and Singh, J. (2014). Taking a hit to save the world: employee participation in social initiatives.
Technical report, Working paper.
Burbano, V. C. (2015). Getting virtual workers to do more by doing good: Field experimental evidence.
Technical report, Working paper.
Burbano, V. C. (2016). Social responsibility messages and worker wage requirements: Field experimental evidence from online labor marketplaces. Organization Science, 27:1010–1028.
Carnahan, S., Kryscynski, D., and Olson, D. (2015). How corporate social responsibility reduces employee turnover: evidence from attorneys before and after 9. Technical report, Working paper.
Chandler, J., Mueller, P., and Paolacci, G. (2014). Nonnaivite among amazon mechanical turk workers:
Consequences and solutions or behavioral researchers. Behavior research methods, 46(1):112–130.
Croson, R. and Gneezy, U. (2009). Gender differences in preferences. Journal of Economic literature,
47(2):448–474.
DellaVigna, S., List, J. A., and Malmendier, U. (2012). Testing for altruism and social pressure in charitable giving. The Quarterly Journal of Economics, 127:1–56.
DellaVigna, S., List, J. A., Malmendier, U., and Rao, G. (2015). Voting to tell others. Technical Report
19832, NBER.
43
D’Haultfoeuille, X. and Février, P. (2015). Identification of nonseparable triangular models with discrete
instruments. Econometrica, 83(3):1199–1210.
Du, S., Bhattacharya, C., and Sen, S. (2011). Corporate social responsibility and competitive advantage:
Overcoming the trust barrier. Management Science, 57(9):1528–1545.
Elfenbein, D. W., Fisman, R., and McManus, B. (2012). Charity as a substitute for reputation: Evidence
from an online marketplace. The Review of Economic Studies, 79(4):1441–1468.
Flory, J. A., Leibbrandt, A., and List, J. A. (2014). Do competitive workplaces deter female workers? A
large-scale natural field experiment on job-entry decisions. The Review of Economic Studies, 82(4).
Friedman, M. (1970). The social responsibility of business is to increase its profits. The New York Times
Magazine, pages 32–33.
Gill, D. and Prowse, V. (2012). A structural analysis of disappointment aversion in a real effort competition. The American economic review, 102(1):469–503.
Godfrey, P. C., Merrill, C. B., and Hansen, J. M. (2009). The relationship between corporate social
responsibility and shareholder value: An empirical test of the risk management hypothesis. Strategic
Management Journal, 30(4):425–445.
Goodman, J. K., Cryder, C. E., and Cheema, A. (2013). Data collection in a flat world: The strengths
and weaknesses of mechanical turk samples. Journal of Behavioral Decision Making, 26(3):213–224.
Greening, D. W. and Turban, D. B. (2000). Corporate social performance as a competitive advantage in
attracting a quality workforce. Business & Society, 39(3):254–280.
Harrison, G. W. and List, J. A. (2004). Field experiments. Journal of Economic Literature, 42(4):1009–1055.
Hickman, B. R., Hubbard, T. P., and Paarsch, H. J. (2016). Identification and estimation of a bidding
model for electronic auctions. Quantitative Economics, forthcoming.
Leibbrandt, A. and List, J. A. (2014). Do women avoid salary negotiations? A large-scale natural field
experiment. Management Science, 61(9).
Levitt, S. D. and List, J. A. (2007). What do laboratory experiments measuring social preferences reveal
about the real world? The journal of economic perspectives, 21(2):153–174.
List, J. A. (2011). The market for charitable giving. The Journal of Economic Perspectives, 25(2):157–180.
McDonnell, M.-H. and King, B. (2013). Keeping up appearances reputational threat and impression
management after social movement boycotts. Administrative Science Quarterly, 58(3):387–419.
Paolacci, G. and Chandler, J. (2014). Inside the turk: Understanding mechanical turk as a participant
pool. Current Directions in Psychological Science, 23(3):184–188.
Roback, J. (1982). Wages, rents, and the quality of life. The Journal of Political Economy, pages 1257–1278.
44
Rosen, S. (1986). The theory of equalizing differences. Handbook of labor economics, 1:641–692.
Servaes, H. and Tamayo, A. (2013). The impact of corporate social responsibility on firm value: The
role of customer awareness. Management Science, 59(5):1045–1061.
Torgovitsky, A. (2015). Identification of nonseparable models using instruments with small support.
Econometrica, 83(3):1185–1197.
Waddock, S. A. and Graves, S. B. (1997). The corporate social performance-financial performance link.
Strategic management journal, pages 303–319.
Wren, D. A. (2005). The history of management thought. John Wiley & Sons.
45
Appendix A: Accuracy with Learning Effect
Accuracy measures whether or not unit q in period t was correctly entered by i . For our purposes,
∗
higher accuracy directly translates to greater output quality. Define the latent accuracy indexer a qi
as
t
∗
a qi
= Θ Aqi + εqi
(32)
where εqi ∼ N (0, 1). The accuracy parameter is defined as
Θ Aqi = Θ Ai + Tqi + β A2 ln(q i )
where
Θ Aqi = θ Ai + β A0 X 0i + βFA0 X 0i D iF
is the baseline accuracy type and
Tqi = β A1 X 1qi + βFA1 X 1qi D iF
is the treatment effect. Note that β A2 is a learning parameter introduced to account for both learningby-doing (higher quality over time) and the risk of exhaustion (lower quality over time). We do not
∗
observe the latent accuracy indexer a qi
directly. Instead, we observe a qi t ∈ {0, 1} where
t
a qi

1
=
0
∗
if a qi
>0
otherwise
This implies
Pr[a qi = 1|Θ Aqi ] = Φ(−Θ Aqi )
where Φ(·) is the standard normal CDF.
As in the case of leisure preference and productivity, we will construct a differencing estimator.
In this case, however, we have to make due with non-linear difference equations. We begin with a
difference equation for learning, we then derive difference equations for treatment and selection effects.
Finally, we formulate a Probit estimator for the individual fixed effects.
6.0.1
Accuracy Learning Effects
Let ∆a qi ≡ a qi − a qi −1 , qi ≥ 2 and note that
∆a qi


−1


= 0




1
∗
∗
>0
< 0 and a qi
if a qi
−1
∗
∗
∗
∗
> 0)
> 0 and a qi
< 0) or (a qi
< 0 and a qi
if (a qi
−1
−1
∗
∗
<0
> 0 and a qi
if a qi
−1
This implies
Pr[∆qi = −1] = Φ(Θ Aqi )Φ(−Θ Aqi −1 )
46
Pr[∆qi = 0] = Φ(Θ Aqi )Φ(Θ Aqi −1 ) + Φ(−Θ Aqi )Φ(−Θ Aqi −1 )
and
Pr[∆qi = 1] = Φ(−Θ Aqi )Φ(Θ Aqi −1 )
With data on within-person differences of accuracy for units produced on the same day, this pins down
the learning effect β2 given some Θ Ai , Tqi since
E [∆a qi ] = Φ(−Θ Ai − Tqi − β A2 ln(q i ))Φ(Θ Ai + Tqi + β A2 ln(q i − 1))
−Φ(Θ Ai + Tqi + β A2 ln(q i ))Φ(−Θ Ai − Tqi − β A2 ln(q i − 1))
≡ A (q i ; Θ Ai , Tqi )
The only part of E [∆a qi ] that changes with each ∆A qi is ln(qi ). Hence, we can write A (qi ; Θ Ai , Tqi ) as a
function of just qi and the parameters Θ Ai and Tqi .
6.0.2
Accuracy Treatment Effects
Let
Y
1
Aji
=
A
ji
=
 ∑Q
i
1[A =1 & X 1qi =1( j =C )]


 qi =1 qi
if


0
otherwise
∑Q i
q i =1 1[X 1q i
=1( j =C )]
∑Q i
q i =1 1[X 1q i
 ∑Q
i

 qi =1 Φ(−Θ Ai −Tqi −β A2 ln(qi ))1[X 1qi =1( j =C )]

∑Q i
qi


0
=1 1[X 1q i =1( j =C )]
if
= 1( j = C )] > 0
∑Q i
q i =1 1[X 1q i
= 1( j = C )]
otherwise
j = C,N
and note that for all i such that
∑Q i
q i =1 1[X 1q i
E
= 1( j = C )] > 0, j = C , N
) ( 1
)]
[( A
A
1
Y C i − Y N i − AC i − A N i = 0
This pins down β A1 and βFA1 given som Θ Ai .
6.0.3
Accuracy Selection Effects
Define
∑I
ωrAl
=
i =1
∑I
0l
Ar
=
i =1
∑Q i
F
q i =1 1[A q i = 1 & D i
∑I ∑Q i
F
i =1 q i =1 1[D i = 1(l
∑Q i
= 1(l = F ) & X 0i = 1(r = C )]
= F ) & X 0i = 1(r = C )]
F
q i =1 Φ(−Θ Aq i )1[D i = 1(l = F ) & X 0i = 1(r = C )]
∑I ∑Q i
F
i =1 q i =1 1[D i = 1(l = F ) & X 0i = 1(r = C )]
47
and note that
E
and
E
[(
) ( 0M
)]
0M
AM
ωN
− ωCAM − A N − AC
=0
) ( 0M
)]
[(
) (
) ( 0F
0F
0M
AF
AM
ωN
− ωCAF − ωN
− ωCAM − A N − AC − A N − AC
=0
This pins down β A0 and βFA0 due to random assignment and
[
]
E θ A |X 0i , D iF = E [θ A ]
6.0.4
Accuracy Fixed Effects
By definition of the Probit model, we have
E [A qi |Θ Ai , Tqi , q i ] = Φ(−Θ Ai − Tqi − β A2 ln(q i ))
The above equations give us a combined NLS estimator of the following form:
[
]
{θ̂ Ai }iI =1 , β̂ A1 , β̂FA1 , β̂ A0 , β̂FA0 , β̂ A2 =
{
arg min
γ
+
I [(
∑
i =1
A
Y Ci
−Y
A
Ni
Qi [
I ∑
∑
]2
∆A qi − A (q i , Θ Ai , Tqi ) 1[Tqi = Tqi − 1]
}
i =1 q i =1
[
]
Qi
Qi
) ( 1
)]2
∑
∑
1
− AC i − A N i
1
1[X 1qi = 1] > 0 &
1[X 1qi = 0] > 0
q i =1
q i =1
) ( 0M
)]
[(
0M 2
AM
− ωCAM − A N − AC
+ ωN
[(
) (
) ( 0F
) ( 0M
)]
0F
0M 2
AF
AM
− ωCAF − ωN
− ωCAM − A N − AC − A N − AC
+ ωN
{
+
Qi (
I ∑
∑
i =1 q i =1
}
)
A qi − Φ(−Θ Aqi )2
48
Appendix B: Analytical Solution for E [θ j (i )]
This appendix presents a method for analytically computing an empirical analog to the left hand side
of
(
I −1
E [θ (i )] = I
i −1
)∫
1
j
0
Q j (t , α j )t i −1 (1 − t )I −i d t ,
∀i = 1, 2, ..., I ,
j ∈ {P, A, L}
Commonly used methods employ the empirical quantile function:
{
I
1∑
j
Qb j (r ) ≡ inf ρ :
1(θ ≤ ρ) ≥ r
I i =1 i
}
But this introduces kinks into the GMM objective function which complicates asymptotic theory and
standard errors. Alternatively, one could consider a simulation-based estimator of quantile i /(I −1) for
some i = 1, 2, ..., I by the following procedure:
j
j
j
1. For s = 1, 2, ..., S , sample I draws from {θ1 , θ2 , ..., θI } with replacement, and order the observations
j
j
to get a sample of simulated order statistics {θ̃s j (1) , θ̃s(2) , ..., θ̃s(3) }
2. Estimate
S
1∑
à
j
j
E [θ(i ) ] =
θ̃
S s=1 s(i )
for any desired i = 1, 2, ..., I .
This simulation method, however, introduces simulation error and also complicates asymptotic theory
and standard errors as well.
à
j
Iit is, however, possible to analytically compute the asymptotic value of E [θ(i ) ] as S → ∞. To do so,
j
j
j
j
j
begin with the ordered data {θ(1) ≤ θ(2) ≤ ... ≤ θ(I ) } and note that if we can compute P km = Pr(θ̃s(m) = θ(k) )
for a given simulated sample S , then
I
∑
à
j
lim E [θ(m) ] =
P km θ j (k)
S→∞
k=1
which eliminates simulation error altogether. Note that
j
j
P km = Pr(θ̃s(m) = θ(k) ) =
I
∑
t =1
Pr(exactly t draws = θ j (k) & m draws ≤ θ j (k))
j
Assuming that all observations θi are unique, then the probability that
(l draws are < θ j (k)) ∩ (t draws are = θ j (k)) ∩ (l − t − l draws are > θ j (k))
is given by a multinomial probability
(
) ( ) (
)
(
[
])
I!
k − 1 l 1 I −t I − k I −t −l
k −1 1 I −k
= M N [l , t , I − t − l ],
, ,
t !l !(I − t − l )!
I
I
I
I
I
I
49
(33)
The intuition here is that when sampling with replacement from the empirical distribution, every observation has uniform weight 1/I , the fraction of values ranked below k is (k − 1)/I , and the fraction
ranked strictly above k is (I − k)/I . Whenever there are m simulated draws that are ≤ θ j (k) with t total draws begin equal to θ j (k), that opens up various mutually exclusive possibilities for how many
draws are strictly below a value of θ j (k). For example, suppose that t = 3. Then if m = 1, there can be
no simulated draws from the data strictly below θ j (k) but if m = 4, ..., I −3 then the four possibilities are
for exactly 0, 1, 2, and 3 draws below θ j (k). If we extend this logic to arbitrary t , m , k , and I , we can
rewrite (33) as
Pr(exactly t draws = θ j (k) & m draws ≤ θ j (k)) =
(
[
])
I min{m−1,I
∑
∑ −t }
k −1 1 I −k
M N [l , t , I − t − l ],
, ,
I
I
I
t =1 l =max{0,m−t }
(34)
By (34), we can explicitly compute P km , ∀(k, m) pairs with k ≤ I and m ≤ I , and analytically compute
I
∑
à
j
lim E [θ(m) ] =
P km θ j (k)
S→∞
k=1
j
using the ordered values of {θi }. But for our quantile function estimator, we pick a set of k quantile
ranks
{
r j = {r 1 , r 2 , ..., r k ∗j } ⊆
I
i
, ...,
I +1
I +1
}
with k ∗j of free parameters in Q j (r ; α j ), j ∈ {P, A, L} and making sure we include in the set r 1 = i /(I + 1)
and r k ∗j = I /(I + 1), and at least one quantile rank in each subinterval defined by the know vector k j .
The probability matrix P is somewhat expensive to compute, but it need only be computed once for a
fixed sample size I quantile rank vector r j , and can be stored and re-used in each iteration:


P 1,r 1 (I +1) P 1,r 2 (I +1) . . . P 1,r k∗ (I +1)


P

 2,r 1 (I +1) P 2,r 2 (I +1) . . . P 2,r k∗ (I +1) 
P=



..................


P I ,r 1 (I +1) P I ,r 2 (I +1) . . . P I ,r k∗ (I +1)
50
Appendix C: Hiring and Work Stage Experimental Material
Figure 20: Screenshot of firm website
Figure 21: Example login information: for-profit client
51
Figure 22: Example login-information: CSR client
Figure 23: Example main screen: for-profit client
52
Figure 24: Example main screen: CSR client
Figure 25: Example data entry screen (top half)
53
Figure 26: Example data entry screen (bottom half)
54