What we value

WATCH THE JAMES DEAN AND
DURATION NEGLECT HYPOTHESES
CRASH AND BURN:
LESSONS FOR EXPERIMENTERS
2017 Empirical Philosophy Workshop – VUW
Dan Weijers & Peter Unger
THE GENERAL APPROACH
• Thought experiments are often used to elicit views about the value of things
• Thought experiments can mislead us in ways we are not aware of
1. An experience machine life has more happiness than a normal life
2. If happiness is all that really matters in life, then the vast majority of people would choose an experience machine life over a normal life
3. The vast majority of people would not choose an experience machine life over a normal life
4. Therefore, happiness is not all that really matters in life
• Premises 2 & 3 are dubious and can be challenged by x-phi
• Premise 1 isn’t dubious, but x-phi might reveal that people don’t accept it
DECONSTRUCTIVE REPLICATION
• Test the original scenario
• Ask why they chose that
• Assess the justifications and
relevant biases literature to
see what modifications are
required
• Make 1 change at a time,
test new scenarios on similar
groups, compare results
There is this amazing experience
machine designed by super-duper
neuropsychologists…
1. Would you live the rest of your
life in such a machine?
• Yes
• No
2. Briefly justify your choice for 1.
THE TARGET
• One aspect (so far) of Diener et al.’s
length of life cases
• Diener, E., Wirtz, D., & Oishi, S. (2001).
End effects of rated life quality: The
James Dean effect. Psychological
Science, 12(2), 124-128.
WORK IN PROGRESS WARNING
• Formal statistical analyses have not yet been run
• Formal qualitative analyses have not yet been run
HYPOTHESES ABOUT THE VALUE OF
HAPPY LIVES OF VARIOUS LENGTHS
• The Common Sense Hypothesis
• Longer lives are more desirable
than shorter lives
• The James Dean Hypothesis
• Shorter lives are more desirable
than longer lives
• The Duration Neglect Hypothesis
• The length of a life doesn’t
affect its desirability
• Doesn’t have a preference
[Figure: lives A, B, and C plotted against number of happy years (axis from 0 to 100); labels: James Dean, Not Preferred, Common Sense]
(PARTIAL) REPLICATION OF
DIENER ET AL.
Jen [Jan] was a never-married woman without children. Her life was
extremely happy, with enjoyable work, vacations, friends, and pleasant
leisure. Jen died suddenly and painlessly in an automobile accident, when
she was 30 [60] years old.
1. Taking Jen’s life as a whole, how desirable was her life?
   (9-point scale: 1 = Most undesirable, 5 = Neutral, 9 = Most desirable)
2. How much total happiness would you say that Jen experienced in her life?
   (9-point scale: 1 = Most unhappy, 5 = Neutral, 9 = Most happy)
3. Please briefly explain your answer to questions 1 & 2:
RESULTS
• Diener et al. took the average of the desirability and total happiness responses for each respondent, and then compared the average of respondents’ judgments about the 30- and 60-year lives.
• We did the same
• Diener et al.: “The age variable failed to have a significant effect,... This finding confirms the phenomenon of duration neglect…”
• Our results are similar

Study            Jen (30-year life)  Jan (60-year life)  Jan - Jen
Diener et al.    6.13                6.48                0.35
Weijers & Unger  6.26                6.15                -0.11
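The scoring procedure described for Diener et al. (average the two ratings within each respondent, then compare condition means) can be sketched roughly as follows. The ratings are invented for illustration and are not data from either study; the function names are hypothetical.

```python
from statistics import mean

# Hypothetical 1-9 ratings, invented for illustration only (not study data).
# Each tuple is one respondent: (desirability, total_happiness).
jen_30 = [(6, 7), (5, 6), (7, 7)]
jan_60 = [(6, 6), (7, 8), (6, 7)]

def combined_scores(responses):
    """Average the two ratings within each respondent."""
    return [(d + h) / 2 for d, h in responses]

def condition_mean(responses):
    """Mean of the per-respondent combined scores for one condition."""
    return mean(combined_scores(responses))

jen_mean = condition_mean(jen_30)
jan_mean = condition_mean(jan_60)

# A positive difference favors the longer life (the "Jan - Jen" column).
difference = jan_mean - jen_mean
```

Because the two ratings are merged before any comparison, a difference in desirability alone can be diluted by the total happiness rating, which is the concern behind reporting desirability separately.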
IMPLICATIONS FOR PHILOSOPHY?
IF DIENER ET AL. WERE RIGHT, THEN…
• Prudential value: what is good for us
• Used in moral theory, political
philosophy, applied ethics, and public
policy
• Should governments focus on making
our lives happy at the cost of them
being long?
• No speed limit
• Decriminalize drugs
• Subsidize alcohol and drugs
• Scrap cancer research in favor of subsidizing vacations
1. Quantitative hedonism: lives are
valued by summing the net happiness
of each moment
2. Prudential theories that value lives in a
way different to the vast majority of
people are bad theories
3. The vast majority of people do not
value lives by summing the net
happiness of each moment
4. Therefore, quantitative hedonism is a
bad theory
OR WE CAN TAKE A CLOSER LOOK:
QUALITATIVE ANALYSIS
Method
There are 9 main categories of answer.
Which main category an answer fits in
depends on 3 things:
1. The question that motivated the thought
experiment (related to what it is
supposed to show)
2. The exact wording of the scenario and
questions
3. The respondents’ answers to the
questions
Then group by meaning within categories
Categories
1. Malicious response
2. Opposite justification
3. Imaginative resistance
4. Overactive imagination
5. Reasonable resistance
6. Useful response
7. No justification
8. Reasonable rejection
9. Demonstrates misunderstanding
DECONSTRUCTIVE REPLICATION:
QUANT/QUALITATIVE ANALYSIS
Categories
1. Malicious response
2. Opposite justification
3. Imaginative resistance
4. Overactive imagination
5. Reasonable resistance
6. Useful response
7. No justification
8. Reasonable rejection
9. Demonstrates misunderstanding
Uses
• % of 6+7 answers can be used to assess a thought experiment’s fitness for purpose
• Compare “fit for purpose” scores of scenario wording changes to ensure they improve
• % of 1+2+9 (+3+4+7?) can be used to gauge the relative suitability of sample groups
• Sub-groups of 3, 4, 5, & 8 used to criticize original & inform creation of new scenarios
• Sub-groups of 6 used as support for theories
• Could just use #s from 6 or 6+7 for quant analysis
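As a rough sketch of the percentage-based uses listed above, the snippet below computes a "fit for purpose" score (% of categories 6+7) and a sample-suitability indicator (% of categories 1+2+9) from a list of coded answers. The coded answers are hypothetical, and the names `share`, `USEFUL`, and `SAMPLE_ISSUES` are introduced here for illustration only.

```python
from collections import Counter

# Category numbers follow the 1-9 coding scheme listed on the slide.
USEFUL = {6, 7}            # useful response, no justification
SAMPLE_ISSUES = {1, 2, 9}  # malicious, opposite justification, misunderstanding

def share(codes, categories):
    """Percentage of coded answers that fall in the given categories."""
    counts = Counter(codes)
    hits = sum(counts[c] for c in categories)
    return 100 * hits / len(codes)

# Hypothetical coded answers for one scenario wording (not study data).
codes = [6, 6, 7, 3, 5, 6, 9, 6, 8, 1]

fit_for_purpose = share(codes, USEFUL)               # % of 6+7
sample_issue_rate = share(codes, SAMPLE_ISSUES)      # % of 1+2+9
```

Computing the same two numbers for each reworded scenario would allow the "fit for purpose" scores to be compared across wording changes.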
OR WE CAN TAKE A CLOSER LOOK:
METHODOLOGY
Potential methodological problems
1. Many respondents de-valued the lives because the women were “never-married… without children”
2. Evaluating a life without a comparator is difficult
3. “Total happiness” data might not be relevant to desirability
X. The scale and labels might cause problems (“most desirable” and “most undesirable” are ambiguous)
Potential solutions
1. Rewrite the scenario so it doesn’t mention whether they were married with children or not
2. Compare Jen’s (30) and Jan’s (60) lives directly against each other
3. Do not combine the “desirability” and “total happiness” responses
X. Run the study with a simpler response mechanism (whose life is more desirable, or are they equal?)
OR WE CAN TAKE A CLOSER LOOK:
HARDER TEST FOR JDH & DNH
Scenario setup tweaks
4. Having 3 options might help participants think more carefully about the potential differences between the lives
5. Increasing the age gap should make the difference between the lives more salient
6. Making the judgments about a life you will have (rather than the life of a stranger) should make the participant take any differences in value more seriously
Tweak details
4. E.g., Compare Jen’s (30), Jay’s (45), and Jan’s (60) lives directly against each other
5. E.g., Compare Jen’s (25) and Jan’s (75) lives directly against each other
6. E.g., “There are two lives…” “…and assuming you are forced to choose one to live yourself…”
AND WE CAN COMBINE TWEAKS
Which enables the investigation of…
• Whether some tweaks only seem to have an effect in the presence or absence of other tweaks
• The average effect of each tweak across slightly different contexts
• The total effect of all of the tweaks combined
So we can learn (approximately)…
• About any interrelationships between tweaks
• The relative importance and absolute effect of each tweak
• Just how bad the original study was, and how people really value the desirability of happy lives of different lengths
TWEAK 1:
CLEAN “NEVER MARRIED” ETC.
• Clean: “Imagine someone was
extremely happy for every year of
their life, until they died suddenly
and painlessly in an automobile
accident at age 30 [60].”
RESULTS
• General increase
• Results now favor commonsense

Study                                   Jen (30-year life)  Jan (60-year life)  Jan - Jen
Weijers & Unger, Rate 1 (BS)            6.26                6.15                -0.10
Weijers & Unger, Rate 1 (BS), clean     6.73                7.32                0.58
Weijers & Unger, Rate both (WS)         5.81                6.60                0.79
Weijers & Unger, Rate both (WS), clean  7.22                8.23                1.02
TWEAK 2:
DIRECTLY COMPARING THE LIVES
• Respondents were presented with
both lives at the same time (i.e.
within subjects design) with the
same question wording as before
RESULTS
• Increase for 60 year lives
• Results favor commonsense even
more

Study                                   Jen (30-year life)  Jan (60-year life)  Jan - Jen
Weijers & Unger, Rate 1 (BS)            6.26                6.15                -0.10
Weijers & Unger, Rate both (WS)         5.81                6.60                0.79
Weijers & Unger, Rate 1 (BS), clean     6.73                7.32                0.58
Weijers & Unger, Rate both (WS), clean  7.22                8.23                1.02
TWEAK 3:
DESIRABILITY RESULTS ONLY
• Instead of combining
“desirability” with “total
happiness”, we just took
the desirability measure
RESULTS
• Increase (for WS)
evaluations of 60 over 30
year life

Study                                   30 combined     60 combined     Combined 60-30  30 desire only  60 desire only  Desire only 60-30
Diener et al., Rate 1                   6.13            6.48            0.35            ?               ?               ?
Weijers & Unger, Rate 1                 6.26            6.15            -0.10           5.74            5.35            -0.40
Weijers & Unger, Rate both (WS)         5.81            6.60            0.79            4.72            6.21            1.49
Weijers & Unger, Rate 1, clean          6.73            7.32            0.58            6.12            6.66            0.54
Weijers & Unger, Rate both (WS), clean  7.22            8.23            1.02            6.40            7.83            1.43
TWEAK 4:
DIRECTLY COMPARING THE LIVES
• Added a middle
option (Jay: 45)
RESULTS
• Mixed: this had a
‘commonsense’
effect on ‘total
happiness’
judgments, but
not ‘desirability’

Study                                                         Jen (30-year life)  Jan (60-year life)  Jan - Jen
W&U, Rate both (WS), clean, 2-option                          7.22                8.23                1.01
W&U, Rate both (WS), clean, 3-option                          6.43                7.71                1.28
W&U, Rate both (WS), clean, 2-option, bigger age gap (25-75)  6.25                8.04                1.79
W&U, Rate both (WS), clean, 3-option, bigger age gap (25-75)  5.02                7.54                2.52
W&U, Rate both (WS), dirty, 2-option                          5.81                6.60                0.79
W&U, Rate both (WS), dirty, 3-option                          5.74                6.62                0.88
TWEAK 5:
BIGGER AGE GAP
• Made Jen die at
25 and Jan 75
RESULTS
• Results favor
commonsense
even more

Study                                                         Jen (30-year life)  Jan (60-year life)  Jan - Jen
W&U, Rate both (WS), clean, 2-option                          7.22                8.23                1.01
W&U, Rate both (WS), clean, 2-option, bigger age gap (25-75)  6.25                8.04                1.79
W&U, Rate both (WS), clean, 3-option                          6.43                7.71                1.28
W&U, Rate both (WS), clean, 3-option, bigger age gap (25-75)  5.02                7.54                2.52
TWEAK 6:
MAKING IT PERSONAL
• “There are two lives…” “…and assuming you are forced to choose one to live yourself…”
RESULTS
• Results favor commonsense even more, especially if just desirability is used

Study                                           Jen (30-year life)  Jan (60-year life)  Jan - Jen
W&U, Rate both (WS), clean, 2-option            7.22                8.23                1.01
W&U, Rate both (WS), clean, 2-option, personal  5.64                7.62                1.98
W&U, Rate both (WS), clean, 3-option            6.43                7.71                1.28
W&U, Rate both (WS), clean, 3-option, personal  4.84                6.39                1.55
COMBINING TWEAKS
• Including all 6 tweaks, compared with our replication of Diener et al.’s original
RESULTS
• Huge difference: 4.04 on a 1-9
scale (over 50% movement!)
• The totally tweaked version
strongly supports
commonsense

Study                                                                                  Jen (30-year life)  Jan (60-year life)  Jan - Jen
W&U, Rate 1 (BS)                                                                       5.74                5.35                -0.40
W&U, clean, Rate both (WS), desirability only, 3-option, bigger gap (25-75), personal  4.00                7.64                3.64
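The headline figures can be checked with a few lines of arithmetic. This sketch assumes that "over 50% movement" is measured against the 8-point range of the 1-9 scale; the slides do not state the denominator, so that reading is an assumption.

```python
SCALE_MIN, SCALE_MAX = 1, 9

original_gap = -0.40  # Jan - Jen, W&U Rate 1 (BS), desirability only
tweaked_gap = 3.64    # Jan - Jen, all six tweaks combined

movement = tweaked_gap - original_gap

# Assumption: movement is expressed relative to the scale's range (9 - 1 = 8).
scale_range = SCALE_MAX - SCALE_MIN
movement_pct = 100 * movement / scale_range

print(f"movement = {movement:.2f} points, {movement_pct:.1f}% of the scale range")
# prints: movement = 4.04 points, 50.5% of the scale range
```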
DIFFERENCES TWEAKS MADE
• Using desirability only
• Simple average of comparisons (doesn’t take varying sample sizes into account)

Tweak                                                                               # of comparisons  Average contribution
1) Cleaning up “no kids” etc. language                                              3                 +0.54
2) Within subjects design (rate 2), rather than BS                                  2                 +1.39
3) Desirability scores only, rather than combo average with total happiness scores  10                +0.82
4) 3rd option added in middle                                                       4                 -0.27*
5) Bigger age difference (most 25-75 instead of 30-60)                              4                 +1.20
6) Making it personal, rather than about a stranger                                 3                 +0.79
*But did increase total happiness contribution
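One plausible reading of "average contribution" is the change in the Jan - Jen gap between two conditions that differ only in the tweak, averaged over all such matched pairs. That reading is an inference rather than something the slides state, but for tweak 2 it appears to recover the reported +1.39 from the desirability-only gaps in the Tweak 3 table. The sample sizes used for the weighted variant are hypothetical, since none are reported here.

```python
# Desirability-only Jan - Jen gaps, copied from the Tweak 3 table.
gaps = {
    ("BS", "dirty"): -0.40,
    ("WS", "dirty"): 1.49,
    ("BS", "clean"): 0.54,
    ("WS", "clean"): 1.43,
}

# Tweak 2 (within-subjects rather than between-subjects): compare each WS
# condition with the BS condition that matches it on scenario wording.
comparisons = [
    gaps[("WS", wording)] - gaps[("BS", wording)]
    for wording in ("dirty", "clean")
]

# Simple (unweighted) average, as described on the slide.
simple_average = sum(comparisons) / len(comparisons)

def weighted_average(values, weights):
    """Average that gives comparisons with larger samples more influence."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical sample sizes, invented for illustration (not from the slides).
hypothetical_n = [40, 60]
weighted = weighted_average(comparisons, hypothetical_n)

print(round(simple_average, 2))  # 1.39
```

With unequal samples the weighted figure can differ noticeably from the simple average, which is the limitation the slide flags.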
CONCLUSION
Specific
• Some students think they think that length of life doesn’t matter
• Most students (probably) think that length of life is important, but far from the only intrinsically important thing in life
• i.e., there is probably a partial duration neglect effect when we evaluate lives in ideal conditions
• At least with these kinds of evaluations (“online” vs “narrative/3rd person/objective” may be different)
General
• Experimental philosophy/psychology is hard
• It takes a long time
• Multiple methods should be combined for more robust results
• It’s no wonder many experiments’ results aren’t highly replicable!
• Always ask the qualitative question
• Some people believe very unusual things