Are Ballot Titles Biased? Partisanship in

Are Ballot Titles Biased?
Partisanship in California’s Supervision
of Direct Democracy
Christopher S. Elmendorf and Douglas M. Spencer*
This study investigates whether and if so under what conditions the
California attorney general (AG), who authors the ballot title and
summary (label) for statewide ballot initiatives, writes ballot language that
is biased rather than impartial. State law demands an impartial label,
but commentators frequently complain that the AG chooses misleading
language to bolster (undermine) measures that the AG or the AG’s party
supports (opposes). In this Article, using a convenience sample of students
from several universities, we measure ordinary observers’ perceptions of bias
in ballot labels for initiatives dating back to 1974. Separately, we
calculate an objective measure of bias using a readability algorithm. We
then test hypotheses about AG strategy, examining whether the extent of
bias in ballot labels varies with the closeness of the election and the degree
to which the measure elicits partisan division. We also examine the
correlation between bias perceptions and observer characteristics such as
support for the ballot measure, trust in government, and social trust.
Introduction ..................................................................................................................... 512
I. Background................................................................................................................... 514
A. The Law and Politics of Ballot Labels ...................................................... 514
B. Do Ballot Labels Matter? ............................................................................. 518
II. Methods, Hypotheses, and Results ......................................................................... 521
A. Readability and AG Strategy ....................................................................... 522
1. Theory and Hypotheses ....................................................................... 522
2. Operationalization ................................................................................ 524
3. Results ..................................................................................................... 527
B. Bias as Ordinary Observers See It ............................................................. 530
* Christopher S. Elmendorf is a Professor of Law at the University of California at Davis. Douglas M.
Spencer is an Associate Professor of Law and Public Policy at the University of Connecticut. For
helpful feedback we are indebted to Sam Issacharoff, Kevin Quinn, Jed Purdy, and participants in the
UC Irvine Law Review symposium on nonpartisan election administration.
511
512
UC IRVINE LAW REVIEW
[Vol. 3:511
1. Theory and Hypotheses ....................................................................... 530
2. Operationalization ................................................................................ 533
3. Results ..................................................................................................... 537
III. Discussion and Conclusion .................................................................................... 546
INTRODUCTION
The brainchild of early twentieth-century Progressives, the ballot initiative
was supposed to enable ordinary citizens to wrest control of lawmaking from
elected officials whom the Progressives saw as marionettes of powerful interest
groups and corrupt party bosses. Whether the existence of the ballot initiative
actually results in greater congruence between state law and median voter
preferences is a subject of ongoing study and debate.1 It is clear, however, that the
political insiders whom the ballot initiative was meant to check are not passive
bystanders. Government officials have sometimes parried initiative proponents by
putting topically related referendum measures on the same ballot (potentially
confusing voters), by refusing to appropriate funds for the implementation of
enacted ballot measures, and by narrowing enacted measures through the courts.2
Government officials may also be able to bolster or undermine a proposed
initiative by controlling how the measure is characterized on the ballot and in the
state-issued ballot pamphlet. In California, the attorney general (AG)—a partisan,
elected official—authors a brief ballot label describing each measure that has
qualified for the ballot, and a somewhat longer summary for the ballot pamphlet,
which the state mails free of charge to all registered voters. The nonpartisan
Legislative Analyst’s Office (LAO) prepares a much more detailed description of
the measure and its likely effects, which appear in the ballot pamphlet following
the AG’s summary.
State law directs the AG to write evenhanded, nonargumentative ballot labels
and ballot-pamphlet summaries, but critics say AGs often flaunt this duty, crafting
biased summaries to improve or diminish a measure’s odds of passage. The LAO
has faced much less criticism. If the critics are right and if the LAO is as
nonpartisan as its name and reputation suggest, then responsibility for writing
ballot labels and ballot-pamphlet summaries arguably should be transferred from
the AG to the LAO or a similarly nonpartisan official.3 A bill to this effect was
introduced in the California Assembly in 2009.4
1. See, e.g., JOHN G. MATSUSAKA, FOR THE MANY OR THE FEW: THE INITIATIVE, PUBLIC
POLICY, AND AMERICAN DEMOCRACY (Benjamin I. Page ed., 2004); Jeffrey R. Lax & Justin H.
Phillips, The Democratic Deficit in the States, 56 AM. J. POL. SCI. 148 (2012).
2. See ELISABETH GERBER ET AL., STEALING THE INITIATIVE: HOW STATE GOVERNMENT
RESPONDS TO DIRECT DEMOCRACY 19 (Paul S. Herrnson ed., 2001).
3. State law presently provides that if the AG is the proponent (sponsor) of an initiative, then
the AG’s ordinary responsibilities with respect to that measure shall be performed by the Legislative
2013]
ARE BALLOT TITLES BIASED?
513
The present study provides the first empirical evidence regarding AG
partisanship (or its absence) in the writing of ballot labels. We create and analyze
objective and subjective measures of ballot-label bias, looking for the patterns we
would expect to find if the AG strategically manipulated ballot labels.
Our objective measure is based on the label’s readability. We argue that, with
respect to propositions that are competitive and likely to polarize voters by party
affiliation and reading level, Democratic AGs have incentives to write
exceptionally easy-to-read labels and Republican AGs have incentives to write
exceptionally difficult labels. But the data do not provide much support for these
hypotheses. With respect to competitive measures, Republican AGs wrote
significantly more difficult-to-read labels for very conservative measures than for
centrist measures. But there is no evidence that Republican AGs manipulated
readability on liberal measures, or that Democratic AGs manipulated readability at
all. Labels authored by Democratic AGs were actually more difficult to read, on
balance, than labels authored by Republicans.
Our subjective measure of ballot-label bias is based on university students’
perceptions. The students were placed behind a veil of ignorance regarding the
author of the ballot label, whether the measure passed or failed, etc. We gave
students a one- to three-page description of the measure, which we said had been
prepared by a “neutral, disinterested expert.” In point of fact, subjects received the
analysis prepared for the ballot pamphlet by the nonpartisan legislative analyst. We
then asked students to read a proposed “ballot title and summary” (the actual
ballot label), and to evaluate whether the title and summary was biased in several
different ways. Subjects who reported bias were asked whether the bias favored a
yes or no vote. Approximately ten students evaluated each label.
We model the existence and direction of perceived bias as a function of both
the individual characteristics of the survey respondents and various political
factors that we hypothesize may affect the AG’s incentive to write a biased label.
We find that bias perceptions are significantly correlated with the survey
respondents’ support for or opposition to the measure, but we do not obtain
statistically significant results with respect to the political covariates. The sign of
the coefficient on the key political interaction term is consistent with our strategic
partisanship hypothesis, but we cannot reject the null hypothesis that this is due to
chance.
Counsel instead. CAL. ELEC. CODE § 9003 (West 2003). However, the courts have allowed the AG to
write ballot labels for measures that the AG takes a very active role in promoting or opposing, so long
as the AG is not technically the “proponent,” i.e., the person who initially files the proposed measure
with the state, seeking a title and summary for the circulating petition. See, e.g., Lungren v. Superior
Court, 48 Cal. App. 4th 435, 440 n.1 (1996) (rejecting challenge to AG-titled measure for which the
AG authored the “official Rebuttal to Arguments Against [the Measure]” in the ballot pamphlet, and
rejecting plaintiffs’ argument that at the very least the AG was not owed deference on the ballot label
given his active role in promoting the measure).
4. Assemb. B. 319, 2009–2010 Reg. Sess. (Cal. 2009).
514
UC IRVINE LAW REVIEW
[Vol. 3:511
I. BACKGROUND
This Part motivates our study by briefly explaining the AG’s legal
responsibilities and duties in the ballot-initiative process, as well as the political
science literature regarding the influence of ballot labels on voter decision making
in initiative and referendum elections.
A. The Law and Politics of Ballot Labels
California delegates responsibility for producing the brief descriptions of
ballot measures that citizens encounter when performing their direct democracy
agenda setting and enactment functions to the attorney general. Proponents who
wish to put a measure on the ballot first submit the text of the measure to the AG
and pay a filing fee. Based on the submitted text, the AG writes a title and
summary for the “circulating petition,” the document that proponents must use to
gather the signatures necessary to qualify the measure for the ballot.5 For each
measure that qualifies, the AG prepares a “title and summary” and a shorter
“ballot label” of up to seventy-five words.6 The title and summary introduce the
measure in the state-issued ballot pamphlet, which is mailed to all registered voters
shortly before voting begins.7 Voters who dig deeper into the pamphlet also find a
detailed analysis of the measure prepared by the nonpartisan Legislative Analyst’s
Office, as well as arguments for and against the measure submitted by activists on
either side.8
On the ballot, initiatives are assigned a number and described by the AGauthored ballot label.9 The label must include the measure’s expected fiscal
impact, as projected by the legislative analyst,10 but everything else falls to the
AG’s discretion. The AG’s discretion is not unfettered, however. The AG is
bound by law to “give a true and impartial statement of the purpose of the
measure,” and not to use language that constitutes an “argument” or that is “likely
to create prejudice, for or against the proposed measure.”11 The AG’s job is to
5. CAL. ELEC. CODE §§ 9004–06 (West 2003).
6. Id. § 9051.
7. Id. § 9094.
8. Id. § 9086(c).
9. Id. §§ 9050–53.
10. Id. § 9051(2)(b).
11. Id. § 9051(2)(c). While this command explicitly applies only to the “ballot title and
summary” (i.e., the description in the ballot pamphlet), it is clear by implication and from other
provisions in the California Election Code that it also applies to the circulating title and summary, and
to the ballot label. See id. Section 9051(2)(b) states that the ballot label is to be a “condensed version
of the ballot title and summary . . . .” Id. § 9051(2)(b). As for the circulating petition title and
summary, it “shall be prepared in the manner provided for the preparation of ballot titles and
summaries in Article 5 (commencing with Section 9050)[.]” Id. § 9004(a).
Moreover, in an unpublished opinion, the California Court of Appeal treated the duty of
impartiality in the labeling of ballot measures as derived from state constitutional provisions
concerning free elections, and, as such, binding on any state actor to whom the legislature might
2013]
ARE BALLOT TITLES BIASED?
515
“reasonably inform the voters of the character and purpose of the proposed
measure,” not to be an advocate for or against it.12
Often, however, the AG has come under attack for crossing the line between
information and advocacy. For example, a Sacramento Bee columnist wrote that
Democratic AG Kamala Harris’s circulating titles and summaries for a pair of
pension reform measures were “nonsense” that “got many on the right screaming
and even partisans on the left privately squirming.”13 Even the liberal Fresno Bee
complained that the “official description of the two measures read like talking
points taken straight from a public employee union boss’ campaign handbook”—
and that these talking points were “not true.”14 Initiative proponents blamed
Harris’s “ugly, partisan, and manipulative”15 language when they withdrew their
petition, claiming that “the Attorney General’s false and misleading title and
summary makes [sic] it nearly impossible to pass.”16
Attacks have sounded in the courtroom too, but the courts have not been
especially receptive. While California’s courts review ballot labels and voterpamphlet language for compliance with the statutory standard, and while they
have been willing to order specific changes in the event of a successful challenge,17
the standard of review is deferential. If “‘reasonable minds may differ as to the
sufficiency’” of AG-prepared ballot materials, the legal challenge will be rejected.18
“‘[O]nly in a clear case should . . . [the AG’s materials] be held insufficient.’”19 The
normative touchstone is whether the label risks “misleading the public with
inaccurate information.”20
Beyond restating the standard of review, it is not easy to characterize what
the courts generally look for and do when adjudicating these challenges. The
delegate this responsibility (including the legislature itself). Clark v. Superior Court, No. C064430,
2010 WL 928384, at 5–6 (Cal. Ct. App. Mar. 16, 2010).
12. Yes on 25, Citizens for an On-Time Budget v. Superior Court, 189 Cal. App. 4th 1445,
1452 (2010).
13. George Skelton, Bias, Low Fee Make Monster of Initiative System, SACRAMENTO BEE (Feb. 20,
2012, 4:55 PM), http://www.sacbee.com/2012/02/20/4H7896/skelton-bias-low-fee-make-monster
.html.
14. Editorial, It’s Up to Brown to Get Pension Reform Results, FRESNO BEE, Feb. 15, 2012, at A15.
15. Editorial, Ballot Measures Summaries Should Be Written by Neutral Party, CONTRA COSTA
TIMES (Feb. 25, 2012), http://www.contracostatimes.com/ci_200040274.
16. Patrick McGreevy, GOP-led Drive for California Pension Initiative Dead for This Year, L.A.
TIMES (Feb. 8, 2012, 3:44 PM), http://latimesblogs.latimes.com/california-politics/2012/02/gop-led
-drive-for-california-pension-initiative-dead.html.
17. See, e.g., Clark, 2010 WL 928384, at 9 (Cal Ct. App. Mar. 16, 2010) (finding the verb
“reforms” to be argumentative in the context of a ballot label for an open-primary measure, and
ordering its replacement with “changes”).
18. Amador Valley Joint Union High Sch. Dist. v. State Bd. of Equalization, 583 P.2d 1281,
1298 (1978) (quoting Epperson v. Jordan, 82 P.2d 445, 448 (1938)); Yes on 25, 189 Cal. App. 4th at
1453.
19. Yes on 25, 189 Cal. App. 4th at 1453 (quoting Epperson, 82 P.2d at 448).
20. Lungren v. Superior Court, 48 Cal. App. 4th 435, 440 (1996) (quoting Amador Valley, 583
P.2d at 1298).
516
UC IRVINE LAW REVIEW
[Vol. 3:511
litigation occurs preelection and in a brief window of time. Few cases make it to
the court of appeal. Since the mid-1970s, the Superior Court of Sacramento
County has had exclusive jurisdiction over preelection challenges to the AG’s
handiwork,21 and this court’s opinions and transcripts are not available through
any electronic database. It may be helpful, however, to provide a few examples of
recent disputes and their resolution by the courts.22
Proposition 32 (2012). Proposition 32 would have disallowed unions and
corporations from donating money directly to candidates for elective office and
from spending money raised via payroll deductions on any political purpose.23 The
original ballot label produced by Attorney General Kamala Harris described
Proposition 32 as “restricting” union contributions to candidates.24 Proponents
sued, arguing that “‘[v]oters deserve to be informed that Prop. 32 doesn’t just
reduce direct contributions from corporations and unions to politicians, it
eliminates them entirely.’”25 The superior court agreed, ordering the word
“restricts” replaced with “prohibits.”26
Proposition 25 (2010). For many years, the California Constitution required
budgets to be adopted by a two-thirds vote.27 Proposition 25 changed the twothirds requirement to a simple majority requirement. The ballot title and label
prepared by Attorney General Jerry Brown stated that Proposition 25 “retains
two-thirds vote requirement for taxes.” The Chamber of Commerce sued, arguing
that this would mislead voters into believing that they had to pass the measure in
order to retain the supermajority rule for taxes. The superior court concurred, but
its judgment was vacated on appeal.28 Pointing to dictionary definitions of the
word “retain,” the court of appeal concluded that most voters probably would not
interpret the title as the Chamber of Commerce and the lower court had.29 The
Court also said the AG’s decision to state in the label that the measure did not
apply to taxes was reasonable “because debates regarding changes to California’s
budget process routinely include a discussion of whether the vote requirement for
raising taxes should be lowered.”30
21. CAL. GOV’T CODE § 88006 (West 2005); CAL. ELEC. CODE § 9092 (West 2003).
22. The following are drawn mostly from appellate opinions available on Lexis and Westlaw.
23. Proposition 32, LEGISLATIVE ANALYST’S OFFICE 4 (Nov. 6, 2012), http://www.lao.ca.gov/
ballot/2012/32_11_2012.pdf.
24. Judge Orders Changes to Prop. 32 Language, L.A. TIMES (Aug. 13, 2012, 6:56 PM), http://
latimesblogs.latimes.com/california-politics/2012/08/judge-orders-changes-prop-32-language.html.
25. Id.
26. Id.; see Titus v. Bowen, No. 34-2012-80001218 (Super. Ct. Sacramento Cnty. Aug. 13,
2012) (order granting in part and denying in part peremptory writ of mandate).
27. For some background on the California budget process and potential reforms, see Ethan
J. Leib & Christopher S. Elmendorf, Why Party Democrats Need Popular Democracy and Popular Democrats
Need Parties, 100 CALIF. L. REV. 69, 100–07 (2012).
28. Yes on 25, Citizens for an On-Time Budget v. Superior Court, 189 Cal. App. 4th 1445,
1452 (2010).
29. Id. at 1454.
30. Id.
2013]
ARE BALLOT TITLES BIASED?
517
Proposition 14 (2010). Proposition 14 established California’s nonpartisan
“top-two” primary election system by constitutional amendment. The amendment
was put on the ballot by the legislature, rather than by a private citizen or interest
group, and in passing Proposition 14 the legislature also specified that the
following label was to be used to describe it on the ballot:
ELECTIONS. PRIMARIES. GREATER PARTICIPATION IN ELECTIONS. Reforms the primary election process for congressional,
statewide, and legislative races. Allows all voters to choose any candidate
regardless of the candidate’s or voter’s political party preference. Ensures
that the two candidates receiving the greatest number of votes will appear
on the general election ballot regardless of party preference.31
After determining that the legislature is bound by the same obligations of
impartiality as the AG, the superior court and the court of appeal held the label
improper. The superior court replaced the title phrase “greater participation in
elections” with “increases right to participate in primary elections,” and added the
fiscal impact statement required of ballot labels authored by the AG.32 The court
of appeal deemed this fix inadequate because it left the argumentative word
“reform” in the label.33 The Court quoted dictionaries to show that “reform”
connotes melioration, and ordered the word replaced with “change.”34 But the
Court rejected opponents’ argument that the title phrase, “increases right to
participate in primary elections,” was argumentative or misleading.35
Proposition 8 (2008). Proposition 8, which qualified for the ballot shortly after
the California Supreme Court recognized a constitutional right to same-sex
marriage, amended the state constitution to limit marriage to opposite-sex couples.
The AG’s ballot label stated in relevant part:
ELIMINATES RIGHT OF SAME-SEX COUPLES TO MARRY.
INITIATIVE CONSTITUTIONAL AMENDMENT. Changes the
California Constitution to eliminate the right of same-sex couples to
marry. Provides that only marriage between a man and a woman is valid
or recognized in California.36
Proponents had argued for the following:
31. S.B. 19, 2009–2010 3d Extended Sess. (Cal. 2009).
32. Yes on 25, 189 Cal. App. 4th at 1451–52.
33. Id. at 1453–54.
34. Id. It also pointed to an earlier case concerning a ballot measure that would increase city
taxes on a utility plant. The title as originally written was, “Amendment of Utility Tax by Removing
Electric Power Plant Exemption.” The court determined that the word “exemption,” “particularly in
the tax context,” “connote[s] unfair influence and special treatment," and it ordered the city to write
“exclusion” in place of “exemption.” Id. at 8 (discussing and quoting Huntington Beach City Council
v. Superior Court, 94 Cal. App. 4th 1417, 1433–34 (2002)).
35. Id. at 1454.
36. Jansson v. Bowen, No. 34-2008-00017351, at *3 (Super. Ct. Sacramento Cnty. Aug. 7,
2008), available at http://ag.ca.gov/cms_attachments/press/pdfs/n1597_ruling_on_proposition_8
.pdf.
518
UC IRVINE LAW REVIEW
[Vol. 3:511
MARRIAGE. CONSTITUTIONAL AMENDMENT. Amends the
California Constitution to provide that only marriage between a man and
a woman is valid or recognized in California.37
Proponents contended that the AG’s label was argumentative and prejudicial
because it used “a strongly negative, active tense verb[—‘eliminates’—]to
characterize the effect of the measure.”38 The court disagreed.
Proposition 209 (1996). The ballot label for Proposition 209 read:
PROHIBITION AGAINST DISCRIMINATION OR PREFERENTIAL TREATMENT BY STATE AND OTHER PUBLIC ENTITIES.
INITIATIVE CONSTITUTIONAL AMENDMENT. Generally
prohibits discrimination or preferential treatment based on race, sex,
color, ethnicity, or national origin in public employment, education, and
contracting . . . .39
Opponents maintained that the label was misleading because the true
purpose of Proposition 209 was to end affirmative action by state and local
governments, not to end discrimination.40 In public statements, newspaper
editorials, and arguments in the ballot pamphlet, proponents had characterized the
measuring as “end[ing] affirmative action.”41 Relying on such extrinsic evidence of
intent, the superior court agreed with the plaintiffs.42 But the court of appeal
reversed, stating that “the title, summary and label provided by the Attorney
General are essentially verbatim recitations of the operative terms of the
measure,” using words that “are all subject to common understanding.”43
B. Do Ballot Labels Matter?
The fact that proponents and opponents of ballot measures litigate
seemingly small differences in the language of the label suggest that they believe
these details may affect whether a measure passes or fails. Their beliefs find
considerable support in the political science literature.
In the study most directly on point, Craig Burnett and Vladimir Kogan
conducted survey experiments on a national sample of voting age adults to
examine the effects of alternative ballot labels on vote intentions.44 Depending on
the treatment condition, subjects viewed either the official ballot label or a
37. Letter from Andrew Pugno, Attorney for Proponents of Proposition 8, to Attorney
General Initiative Coordinator (Oct. 15, 2007) (on file with authors).
38. Jansson, No. 34-2008-00017351, at *4.
39. All ballot labels discussed in this Article are on file with the authors.
40. Lungren v. Superior Court, 48 Cal. App. 4th 435, 441–42 (1996).
41. Id.
42. Id.
43. Id.
44. Craig M. Burnett & Vladimir Kogan, The Case of the Stolen Initiative: Were the Voters
Framed? ( June 15, 2012) (unpublished manuscript), available at http://papers.ssrn.com/sol3/papers
.cfm?abstract_id=1643448.
2013]
ARE BALLOT TITLES BIASED?
519
plausible alternate label, and subjects either received or did not receive prominent
interest group endorsements. The measures under study included California’s
same-sex marriage ban, a Colorado anti-abortion measure, and a San Diego
schools bond.45 In the no-endorsements condition, Burnett and Kogan found that
the alternate label resulted in an eight percentage-point swing in support for the
abortion and same-sex marriage measures, but had no effect on the school bond
measure.46 Providing respondents with interest group endorsements reduced the
magnitude of the label effect by about fifty percent,47 but even a four percentagepoint swing in reported vote intentions is quite striking.
One must be cautious about extrapolating from Burnett and Kogan’s study
to the real world. Their subjects included nonvoters as well as voters, and subjects
were asked how they would vote in either an informational vacuum or an
informational environment containing a single endorsement that the researchers
(rather than the subjects) deemed relevant. In the real world of ballot initiative
elections, voters may acquire information from political advertising, the ballot
pamphlet, or friends and neighbors—information that makes them less sensitive
to variations in ballot-label wording. Then again, most voters do not pay much
attention to politics,48 and the political economy of getting a measure on the ballot
tends to result in the selection of measures that provide concentrated benefits for
proponents and diffuse costs for everyone else.49 These measures do not engender
much campaign spending on the no side.50 As a consequence, voters are unlikely
to learn many endorsements that counsel in favor of a no vote, making them quite
dependent on the ballot label.51
Other findings generally corroborate the intuition that voters rely on ballot
45. The alternative labels were realistic and fairly chosen. In the case of the same-sex marriage
ban, the alternative label was the title and summary on the circulating petition. In the case of the
abortion measure, the alternative label was the label used for a similar measure in another state. In the
case of the school bond measure, the alternative stated that the increase in property taxes that would
be necessary to service the debt. Id. at 15–18.
46. Id. at 22–23.
47. Id.
48. See generally Christopher S. Elmendorf & David Schleicher, Informing Consent: Voter Ignorance,
Political Parties, and Election Law (UC Davis Legal Studies Research Paper Series No. 285, 2012)
(examining how law can influence generally uninformed voters to obtain meaningful representation).
49. Thad Kousser & Mathew D. McCubbins, Social Choice, Crypto-Initiatives, and Policymaking by
Direct Democracy, 78 S. CAL. L. REV. 949, 951–57 (2005).
50. Id.
51. Surveys of California voters done in the early 1990s found that fifty-four percent of
California voters say that they rely on the ballot pamphlet and that, of the voters who read pamphlets,
eighty to ninety percent reported that the pro/con arguments or names of endorsers (presented in
pro/con arguments) were especially helpful. SEAN BOWLER & TODD DONOVAN, DEMANDING
CHOICES: OPINION, VOTING, AND DIRECT DEMOCRACY 55–59 (1998). However, these numbers
may be exaggerated (respondents who wish to present themselves as “good” citizens may overreport
reading the ballot pamphlet, just as they overreport voting). And even read in the most favorable
light, they suggest that about fifty percent of voters do not glean useful information about
endorsements from the ballot pamphlet.
520
UC IRVINE LAW REVIEW
[Vol. 3:511
labels in initiative and referendum elections. Using opinion-poll data, Sean Bowler
and Todd Donovan demonstrated that self-interest explained much more of the
variation in vote intentions on a school voucher ballot measure when respondents
were provided with the proposition’s ballot description and name, rather than the
name alone.52 Without an accurate label, many voters were unable to derive a
position from their interests.
In an exit-poll study of voting on a California renewable energy measure,
Craig Burnett, Elizabeth Garrett, and Mathew McCubbins show that voters’
factual knowledge about the measure and awareness of endorsements had
essentially no impact on vote choice.53 Regardless of knowledge or endorsements,
there was strong support for the proposition among voters who said they
supported renewable energy even if electricity rates may rise, and strong
opposition among those who disagreed.54 A possible explanation is that the ballot
label—which everyone sees at the moment they vote—swamps other influences
on vote choice.55 Burnett has since replicated these findings with exit polls on
seven other ballot initiatives.56
Two other recent studies speak to the potential importance of ballot labels
for voting in direct democracy. Michael Binder examined twenty-two ballot
measures and found that anywhere from eighteen percent to more than fifty
percent of voters acknowledged being confused about the measure.57 For seven of
the measures, Binder asked voters about their policy preferences with respect to
the subject of the measure, and inferred the “correct vote” for each respondent.
The percentage of incorrect votes cast by respondents who acknowledged being
confused was five to fifteen points higher than the rate among nonconfused
voters. With respect to five of the seven measures, the erroneous votes cast by
confused voters basically canceled one other out, but the confused erred
systematically in favor of voting yes on the other two measures.58
52. Id. at 55–59.
53. Craig M. Burnett et al., The Dilemma of Direct Democracy, 9 ELECTION L.J. 305 (2010).
54. Id. at 314–17.
55. Id.
56. Craig M. Burnett, Informed Democracy? How Voter Knowledge of Initiatives Influences
Consistent Voting (2009) (unpublished manuscript), available at http://www.olemiss.edu/depts/
political_science/state_politics/conferences/2009/papers/20.pdf. Some caveats are in order. First,
Burnett’s measure of “knowledge of endorsements” is based on one or two endorsements he deems
most useful for voting on each measure. It is possible that voters relied on other endorsements,
knowledge of which Burnett did not measure. Second, Michael Binder’s study of some of the same
ballot measures finds that endorsement knowledge is correlated with voting correctly (i.e., in
accordance with the voter’s policy preference). See Michael M. Binder, Getting it Right or Playing it
Safe? Confusion, the Status Quo Bias and Correct Voting in Direct Democracy 126–27 (2010)
(unpublished Ph.D. dissertation, UC San Diego).
57. Binder, supra note 56, at 46–76.
58. Id. at 128–30.
2013]
ARE BALLOT TITLES BIASED?
521
Binder shows that voter confusion is correlated with individual-level
attributes such as interest in politics, relevant factual knowledge, and knowledge of
endorsements,59 and also strongly related to the “difficulty” of the issue.60 He did
not examine ballot-label effects, but it is plausible that complex ballot labels
contribute to voter confusion. It also seems likely that confused voters are
susceptible to being swayed by argumentative, prejudicial, or otherwise inaccurate
ballot labels. Certainly if we were in the business of drafting ballot labels to skew
the vote, we would be pleased to learn that anywhere from twenty to fifty percent
of voters are likely to be confused about the measure labels we drafted, and that
these confused voters are much more likely than others to vote against their policy
preferences.
In addition to causing “mistaken” votes, a complicated ballot label may lead
some voters to skip the measure or “roll-off”61 the ballot altogether. Studying
more than 1200 recent measures, Shauna Reilly and Sean Richey found a positive,
statistically significant correlation between voter roll-off and language complexity
in the ballot label.62 Their study relies on aggregate data, and the apparent causal
relationship between language complexity and voter abstention may or may not
hold up in future studies with individual-level data.63 But Reilly and Richey’s
results are at the very least suggestive and, along with Binder’s work, they motivate
one of the empirical strategies we pursue in this Article.
II. METHODS, HYPOTHESES, AND RESULTS
The principal barrier to studying bias in the labeling of ballot measures is the
lack of any agreed-upon method for ascertaining the existence and extent of bias.
If there is no room for reasonable disagreement about whether any given ballot
label is argumentative, prejudicial, or otherwise likely to mislead voters, there
would be no point in litigating the label because the case’s outcome would be a
foregone conclusion. Yet without a defensible, quantifiable measure of ballot label
bias, it would seem impossible to empirically assess the allegation that attorneys
general behave as strategic partisan actors when writing labels.
We propose two solutions to this problem. First, following Reilly and
59. Id. at 71–76.
60. Id. at 58–62. Binder measured issue difficulty using the classification system developed in
Edward G. Carmines & James A. Stimson, The Two Faces of Issue Voting, 74 AM. POL. SCI. REV. 78
(1980).
61. “Roll-off” refers to voters that complete the first part of a ballot but then leave downballot races and measures blank. Roll-off is the product of “voter exhaustion” or “voter fatigue” due
to various issues such as the complexity, length, or design of a ballot. See, e.g., Charles S. Bullock III &
Richard E. Dunn, Election Roll-Off: A Test of Three Explanations, 32 URB. AFF. REV. 71 (1996).
62. Shauna Reilly & Sean Richey, Ballot Question Readability and Roll-Off: The Impact of Language
Complexity, 20 POL. RES. Q. 1 (2009).
63. Binder, supra note 56, at 88–104 (showing that standard results linking voter confusion to
roll-off, which were established with aggregate data, do not hold up with individual-level data).
522
UC IRVINE LAW REVIEW
[Vol. 3:511
Richey, we code ballot labels using an objective measure of their readability. We
then investigate deviations from the readability norm, asking whether the pattern
of deviations is consistent with what one would expect from strategic partisan
actors. This empirical strategy does not establish that low-readability (or highreadability) labels are “biased” in the sense of using argumentative or prejudicial
language, or failing to convey the central purpose of the measure. Nor does it
establish that the average level of readability is optimal or fair. Rather, this strategy
does allow us to say whether AGs behave like strategic actors who seek to enable
or disable correct voting by citizens with limited reading ability.64
Our other strategy relies on ordinary observers’ perceptions of bias. Though
individual bias evaluations may be erratic (high variance) or distorted by the
prejudices of the observer, the average judgment of many observers should
approximate the truth if there’s a truth to be found. This intuition is grounded in
the Condorcet Jury Theorem.65 So long as each observer’s judgment of bias
contains some information as well as noise, and the observers’ collective errors are
not too highly correlated, the average of the observers’ bias opinions will converge
on the truth as the number of observers grows larger.66
The balance of this Part describes each of our methods in some detail,
explains the hypotheses we propose to test with each, and presents our results.
A. Readability and AG Strategy
1. Theory and Hypotheses
As noted above, Reilly and Richey found a strong correlation between ballot
label readability and voter roll-off.67 The more complex the ballot label, the greater
the likelihood that citizens who voted in the top-of-the-ballot race abstained from
voting on the proposition.68 It seems likely that the roll-off associated with
language complexity is more pronounced among voters whose reading ability is
limited.69 This problem represents something of an opportunity for strategic
attorneys general: If you do not like the way that low-reading-level (LRL)
populations are likely to vote on the initiative if they understand it, write a very
complex ballot label. Conversely, if you agree with the majority view among LRL
64. Note that our readability scores only capture the readability of the English-language
version of the ballot label. Translated versions could differ. We are grateful to Kevin Quinn for
raising this issue.
65. Condorcet Jury Theorem is a political science theorem that “establishes that under certain
conditions a majority of a group . . . is more likely to choose the ‘better’ alternative than any one
member of the group.” Krishna K. Ladha, The Condorcet Jury Theorem, Free Speech, and Correlated Votes,
36 AM. J. POL. SCI. 617, 617 (1992).
66. Id. at 618–19.
67. Reilly & Richey, supra note 62, at 64–65.
68. Id.
69. Reilly and Richey do not address this question.
2013]
ARE BALLOT TITLES BIASED?
523
populations, write an exceptionally simple label.70 These strategies should also
work if complicated labels increase incorrect voting rather than roll-off among
LRL populations.
An informal model may help to clarify our hypotheses. Assume that the
least-cost strategy for the AG is to write a label of ordinary complexity. Extremely
simple or extremely complex labels are more costly to write: they require more
effort to craft, they may subject the AG to accusations of mischief or impropriety,
and they may induce litigation and be rewritten by the courts.71 Whether a strategic
AG is willing to bear these costs depends on the payoff. The payoff is likely to
vary with (1) the strength of the AG’s own opinion on the issue or that of her
political backers, (2) the degree to which voter opinion is polarized on readinglevel lines, and (3) the expected closeness of the election.
The logic here is straightforward. The payoff from manipulating language
complexity is a function of the payoff from winning the election (factor 1) and the
likelihood that the manipulation will be outcome-determinative (factors 2 and 3).
The manipulation is more likely to affect the outcome insofar as the election is
likely to be close, and insofar as LRL voters who understand the measure all
support it or all oppose it, with high-reading-level (HRL) voters taking the
contrary position. If, instead, LRL voters split fifty-fifty on the merits of the
measure, then the effect of language manipulation inducing them to abstain or to
vote randomly would have no effect on whether the measure passes or fails.
Homogeneity of opinion among HRL voters matters too. If voters who
agree with the AG are concentrated among HRL populations, then the AG who
writes a complex label to disable LRL voters does not have to worry very much
about losing votes or inducing error among citizens with whom the AG agrees.
Conversely, the AG who writes an exceptionally simple label to help LRL citizens
probably forgoes some degree of detail or nuance, which may increase the
incorrect voting rate among HRL voters relative to baseline levels associated with
typical ballot labels. The greater the polarization of opinion by reading level, the
more this marginal increase in incorrect voting by HRL citizens benefits an AG
who shares the opinion of LRL citizens.
70. This strategy should work so long as more complex labels result in disproportionately
greater confusion among LRL voters compared to the electorate as a whole. Even if, as Binder finds,
confused voters do not abstain, voter confusion is likely to introduce a random element into choice.
As confusion grows, the share of people voting yes among the LRL population is likely to converge
on fifty percent. At the limit, this is functionally equivalent to abstention. In either case, the confused
have no impact on whether the ballot measure passes or fails.
71. This assumes that there is a positive or U-shaped relationship between language
complexity and the probability of judicial invalidation. That proposition is not certain, but it is
plausible insofar as the courts in reviewing ballot labels focus on whether the measures are likely to
mislead voters. See supra Part I.A.
524
UC IRVINE LAW REVIEW
[Vol. 3:511
Figure 1: Expected Relationship Between Ballot Label Complexity and Rates
of Voter Abstention and/or Error Among Strong and Weak Readers
High
Error / Abstention Rate
Low-reading-level voters
High-reading-level voters
Low
Low
Normal
High
Ballot Label Difficulty (higher scores = harder to read)
2. Operationalization
To test our hypotheses, we need measures of ballot label readability, the
strength of AG opinion about the proposition and the opinion of the AG’s
supporters, the expected closeness of the election at the time the label was drafted,
and the expected polarization of opinion by reading levels at the time of drafting.
Following Reilly and Richey, we code readability using the Flesch-Kincaid
Grade Level formula, which linguists have used for more than half a century to
estimate the number of years of education necessary to read and comprehend a
passage.72 The formula is based on the average sentence length and average
number of syllables per word in each passage.73
As for strength of AG opinion, we cannot observe how much the AG
personally cares about each measure.74 But we can observe how thoroughly the
72. James N. Farr et al., Simplification of Flesch Reading Ease Formula, 35 J. APPLIED PSYCHOL.
333, 334 (1951). We also tried several other measures of readability, none of which led to any
difference in our results (the measures of readability are highly correlated with one another).
73. The formula is 0.39 × (average sentence length) + 11.8 × (average syllables per word)
15.59. See id. at 334. The resulting number estimates the number of years of education necessary to
read and comprehend the passage. See J. PETER KINCAID ET AL., DERIVATION OF NEW READABILITY FORMULAS (AUTOMATED READABILITY INDEX, FOG COUNT AND FLESCH READING EASE
FORMULA) FOR NAVY ENLISTED PERSONNEL (1975).
74. We considered trying to create same-scale measures of the ideal points of each AG and
the cutpoints of each ballot measure, using roll-call votes to place AGs (who had served in legislative
bodies), newspaper endorsements to place the measures, and newspaper endorsements on
referendums to bridge the issue spaces. But creating the database of newspaper endorsements would
2013]
525
ARE BALLOT TITLES BIASED?
measure divides voters along party lines. The AG, a partisan official, surely faces
some pressure to toe the party line on issues where the parties have polarized. We
therefore use partisan polarization of opinion as a rough proxy for how much the
AG cares whether a proposition passes or fails.75
Ideally our metric of the measure’s divisiveness (by party) would be
contemporaneous with the drafting of the label. But contemporaneous polling
data is not available for all measures, and even if it was, use of polling data would
create other problems. For example, opinion polls include nonvoters. Also,
partisan divisiveness at the time of drafting may be latent. Voters will not have
paid attention to the measure yet, or received signals from party elites. Thus, polls
taken at the time of drafting may grossly understate the eventual polarization of
the electorate, and the AG, as a forward-looking partisan official, probably cares
about elite opinion and anticipates how the electorate will react in the future.
To avoid these problems, we proxy partisan divisiveness using actual votes
on the ballot measure. We ranked California’s eighty assembly districts by the
presidential vote share for each major party. We then identified the six most
Republican-leaning and six most Democratic-leaning districts, and for each ballot
measure we subtracted the average yes vote in the heavily Democratic districts
from the average yes vote in the heavily Republican districts.76 Finally, we
normalized the result. This gives us a rough metric of the measure’s
“conservativeness.”77 Propositions with respect to which Republican and
Democratic districts voted similarly have a conservativeness score of zero.
Propositions that received more yes votes in Republican than in Democratic
districts have a positive score, whereas propositions that did better in Democratic
districts have a negative score. As a proxy for the measure’s divisiveness (by
party), we use the absolute value of conservativeness.
have required an enormous investment of resources, and, in any event, the undertaking would not
have yielded ideal points for all of the AGs in our dataset because some neither served in the
legislature (or as governor), nor ran against an opponent who had served in these capacities.
75. It is an imperfect proxy because it does not capture the AG’s personal strength of feeling,
or the depth of concern on the part of interest groups that have supported the AG and have money
to burn.
76. The partisan divisiveness ratings are robust to small changes in the number of districts
included; six is admittedly arbitrary. The scores are nearly identical whether we use four or eight
districts.
Table 1: Pairwise Correlation Matrix of Polarization Scores
Using 4 Districts
Using 6 Districts
Using 8 Districts
Using 4 Districts
1.000
0.997
0.994
Using 6 Districts
Using 8 Districts
1.000
0.998
1.000
77. This can also be understood as a measure of the ballot initiative’s “Republicanness,” but
that term is even more awkward than “conservativeness” so we will use the latter throughout.
526
UC IRVINE LAW REVIEW
[Vol. 3:511
This measure of divisiveness is imperfect. It is posttreatment in the sense
that we are measuring opinion given the ballot label the AG drafted, rather than
the degree of partisan division that would have occurred had the AG drafted a
“normal” ballot label. If competitive ballot measures on which the electorate is
latently divided by party tend to induce crafty ballot labels, and if these labels tend
to dampen the expression of partisan disagreement (for example, by increasing the
rate of incorrect voting), then a posttreatment measure of divisiveness by party
will understate the latent partisan divide at the time of drafting in some cases.
There is also a potential ecological inference problem with our measures of
conservativeness and divisiveness. Because we rely on aggregate data, we cannot
be sure that Republicans and Democrats were as divided on the measure as our
results suggest. For example, it is possible that some of the apparent divisiveness
(as we measure it) is due to relatively conservative Democrats who live in
Republican-dominated districts voting for conservative measures and against
liberal ones, or relatively liberal Republicans who live in Democratic districts
voting more like Democrats. That said, we are fairly confident that our rough
measure of divisiveness by party helps to identify measures that divide liberals and
conservatives,78 and other things equal, we expect the AG to care more about and
to face more political pressure on such measures.79 We also take some comfort in
the fact that the divisiveness scores remain nearly identical if we substitute
precinct-level for district-level election returns.80
We proxied the expected competitiveness of ballot measures by calculating
the absolute value of the difference in statewide vote totals (yes and no) for each
ballot measure. We then inverted the result so that small values represent relatively
noncompetitive measures and large values represent highly competitive
measures.81 Finally, we normalized the result. Like our measure of polarization,
our measure of competitiveness is posttreatment. The ideal measure would be
pretreatment and contemporaneous with the drafting of the label, since strategic
manipulation of ballot labels is hypothesized to vary with the expected closeness
rather than the actual closeness of the election. But the same obstacles that
78. Many researchers before us have used presidential vote share to proxy the liberalism or
conservatism of voters in a legislative district. See, e.g., NOLAN MCCARTY ET AL., POLARIZED
AMERICA: THE DANCE OF IDEOLOGY OF UNEQUAL RICHES 63–66 (2006). More importantly, it has
recently been shown that presidential vote share is highly correlated with the ideological position of
the median voter in California’s assembly districts. See Seth E. Masket & Hans Noel, Serving Two
Masters: Using Referenda to Assess Partisan Versus Dyadic Representation, 65 POL. RES. Q. 104 (2012).
79. Of course, other things are not always equal. The AG may well face pressure on measures
that are not ideologically divisive, but have strong support or opposition from high-spending interest
groups or individuals. (Indian casino measures are perhaps an example.) In future work, it might be
worthwhile to code measures by the degree to which campaign spending is concentrated (just a few
donors) or diffuse (lots of donors).
80. We compared our Assembly-based polarization scores from 2002–2010 to precinct-level
data and the polarization scores were correlated at 94.4%.
81. Competitiveness = abs(yea nay) × (1).
2013]
ARE BALLOT TITLES BIASED?
527
prevent us from obtaining a good contemporaneous measure of partisan
polarization also prevent us from obtaining a good contemporaneous measure of
competitiveness.
As a very rough proxy for the expected degree of polarization by reading
level, we used a dummy variable set to one if a measure is redistributive, and zero
otherwise. We hypothesize that strategic AGs will generally expect polarization by
reading level on redistributive measures, given the well-known correlation between
educational attainment and income, and given previous work showing that
support for redistributive ballot measures is inversely related to the proportion of
college-educated voters in a district.82
Each author independently classified each ballot measure as redistributive or
not, based on the LAO’s description of the measure. Measures were deemed
redistributive if, in the coder’s opinion, an ordinary voter’s attitude toward
redistribution from the rich to the poor would have a substantial effect on
whether she supports or opposes the measure. Measures were therefore coded as
redistributive if they would extract resources disproportionately from higher
income/wealth populations and/or confer benefits disproportionately on lower
income/wealth populations, or if they would do the opposite or erect political
barriers to redistribution (e.g., supermajority requirements for tax increases). Thus,
our redistribution dummy captures both liberal and conservative measures.83 (In
the results reported below, measures are coded as redistributive only if both
authors agreed that the measure was redistributive.84)
3. Results
As Table 2 and Figure 2 show, our results provide only partial support for
our hypotheses. Figure 2 plots the readability and conservativeness scores of each
ballot initiative. Higher readability scores mean the ballot label was more difficult
to read. The curves show the relationship between conservativeness and
readability for Democratic- and Republican-authored labels. Assuming that LRL
populations tend to be liberal, we expected a U-shaped relationship between the
82. James M. Synder, Jr., Constituency Preferences: California Ballot Propositions, 1974–1990, 21 LEG.
STUD. Q. 463 (1996). Snyder shows that, during the period of his study (roughly half the period of
ours), California voter preferences on ballot measures had three dimensions: a redistribution
dimension, an environment/civil rights/public goods dimension, and a North-South dimension
(mostly about water conflicts). Snyder finds that support for measures that load heavily onto the
redistribution dimension is greatest in assembly districts with more Democratic and fewer college educated
voters. Id. at 465, 471–75.
83. We also included measures in the redistributive category if they provided equal benefits to
everyone but would disproportionately benefit low-income voters. (Imagine a proposition that would
establish free, state-provided health care and make it available to everyone.)
84. Our individual assessments of redistributive measures were highly correlated
(Krippendorff’s alpha = 0.911). One author identified fifty-eight measures as redistributive and one
identified fifty-five. Our sample is limited to the fifty-three measures where there was inter-coder
agreement.
528
UC IRVINE LAW REVIEW
[Vol. 3:511
conservativeness of a ballot measure and the complexity of Republican-authored
labels. If manipulating ballot labels is costly, AGs are strategic, and the payoff that
AGs receive for the passage or defeat of a ballot measure increases with the
measure’s ideological divisiveness, we expected to obtain this result. Republican
AGs will put in more effort to confuse LRL voters on very liberal and very
conservative propositions. The U-shape should be most pronounced on
competitive and redistributive propositions. Conversely, we expect to see an
inverted-U relationship between ballot measures’ conservativeness score and the
readability of Democrat-authored labels, as Democratic AGs should be especially
keen to avoid confusion among low-level readers on very liberal and very
conservative measures.
The expected U-shaped and inverted-U-shaped relationships between AG
party, ballot-measure conservativeness, and readability did not generally
materialize. However, Republican AGs did write increasingly hard-to-read ballot
labels on competitive measures that were more conservative than average. This effect
is statistically significant. Graphically, the shaded gray region in the top-right panel
of Figure 2 is a ninety-five percent confidence interval on readability; the upper
bound of the interval when conservativeness equals zero (which describes a
measure that is neither conservative nor liberal) is lower than the lower bound of
the interval for the most conservative measures.
The top left panel in Figure 2 shows that there is no relationship between
readability and conservativeness for Republican-authored labels on less
competitive measures. The fact that Republican AGs have written hard-to-read
labels only on competitive conservative measures tends to corroborate our
strategic partisanship hypothesis. Note that Republican AGs did not write
especially hard-to-read labels on redistributive measures as we hypothesized.
However, this nonresult may be an artifact of an extremely small sample size. Our
dataset contains only two conservative, competitive, redistributive measures
whose labels were authored by a Republican AG.
Though Republican AGs’ writing of increasingly hard to read labels on very
conservative competitive measures is consistent with our hypotheses, the fact that
Republican AGs did not also write more difficult labels for very liberal measures is
puzzling. So too is the absence of the predicted relationship between readability
and ballot-measure conservativeness for labels authored by Democrats. Indeed, as
Table 2 shows, labels produced by Democratic AGs tend to be slightly harder to
read than labels generated by Republican AGs. This is so whether one looks at the
data in the aggregate or subsets it by competitiveness, divisiveness, or
redistributiveness.
2013]
ARE BALLOT TITLES BIASED?
529
Table 2: Average Flesch-Kincaid Grade-Level Readability Scores for Ballot
Labels Conditioned on the Competitiveness and Divisiveness of the Measure,
and on the Party of the AG that Authored the Ballot Label
50% Most Competitive
Redistributive
Other
12.4
13.4
11.7
12.4
Democratic AG
Republican AG
All
13.1
12.2
Democratic AG
Republican AG
50% Most Divisive (by Party)
All
Redistributive
Other
13.1
12.4
13.5
12.3
10.7
13.4
All
12.9
11.5
50% Least Competitive
Redistributive
Other
12.9
12.9
11.2
11.7
50% Least Divisive (by Party)
All
Redistributive
Other
12.9
13.1
12.9
11.4
12.6
11.1
What accounts for our limited results? It may be the case that the utility of
passing conservative ballot measures is higher for Republican AGs than the utility
of defeating liberal measures, although why this might be so is unclear. It may be
the case that Democratic AGs have failed to see the potential payoff from
manipulating language complexity. But this strikes us as unlikely; scholarly and
elite attention to the readability of state-provided voter information materials is
not new.85 A third possibility is that the Flesch-Kinkaid algorithm does a poor job
capturing the readability of very short texts (recall that the ballot labels are no
longer than seventy-five words). But if measurement error was the story, one
would not expect a strong correlation between readability and roll-off, as Reilly
and Richey found, nor the relationship between conservativeness and readability
for Republican-authored competitive measures that we found.
Perhaps norms and rules internal to the AG’s office have professionalized
the writing of ballot labels in ways that limit systematic manipulation of language
complexity. Consistent with this hypothesis, Reilly and Richey report that the
variance in the readability of California ballot labels is quite low compared to
variance in the readability of ballot labels in most other states.86 Our limited but
suggestive results should certainly provide impetus for future investigations of
strategic partisanship in the labeling of ballot measures in other states.
85. For an early and influential contribution, see DAVID B. MAGLEBY, DIRECT
LEGISLATION: VOTING ON PROPOSITIONS IN THE UNITED STATES (1984). See also PHILIP L.
DUBOIS & FLOYD F. FEENEY, IMPROVING THE CALIFORNIA INITIATIVE PROCESS: OPTIONS FOR
CHANGE 138–40 (1992).
86. California has the fourth lowest variance in readability scores for labels among the fortysix states with statewide ballot measures between 1997 and 2007. Standard deviations ranged from 1.1
to 26.4 (in grade-level units). The standard deviation of California’s readability scores was 1.8. See
Reilly & Richey, supra note 62, at 63.
530
UC IRVINE LAW REVIEW
[Vol. 3:511
Figure 2: Flesch-Kincaid Grade-Level Readability Scores
for Every California Ballot Measure 1976–201087
B. Bias as Ordinary Observers See It
1. Theory and Hypotheses
If the AG tries to write ballot labels that induce wavering or confused
citizens to vote in accordance with the AG’s preferences, these efforts should be
detectable by dispassionate observers who closely study the text of the ballot
measure and the AG’s label. But observers may often err in evaluating bias. Some
errors may be random noise. Others may be predictable, as each observer brings
to the task of evaluating bias her own preconceptions and prejudices, as well as
varying levels of care or capacity.
87. The top panels are all measures (N=186). The middle panels represent measures that the
authors identified as redistributive (n=53). The bottom panels represent all nonredistributive
measures (n=133). The x-axis is ballot-measure conservativeness as measured by the difference
between presidential vote share in highly Republican assembly districts and highly Democratic
assembly districts. Curves are locally weighted least squares (LOWESS) smoothers. The shaded region
in the top right quadrant represents a ninety-five percent confidence interval.
2013]
ARE BALLOT TITLES BIASED?
531
If individual perceptions of bias contain an element of truth but also vary
with the observer’s capacities, prejudices, and preconceptions, then one can test
the hypothesis that AGs behave strategically in writing ballot labels by modeling
observers’ perceptions of bias as a function of (1) a vector of individual-level
covariates (traits likely to be correlated with perceived bias), and (2) a vector of
political covariates (circumstances that strengthen or soften the AG’s incentive to
write biased labels). Of course, if perceptions of bias contain only a little
information about actual bias, the researcher must obtain a large number of bias
observations in order to detect the influence of AG partisanship.
We hypothesize that, at the individual level, bias evaluations will vary
depending on whether the observer supports or opposes the measure, and
whether the observer is a skilled reader. We also expect bias perceptions to
depend on whether the observer thinks the ballot initiative process benefits the
public generally or mainly benefits special interests, whether the observer is
generally trustful of government and other people, and whether the observer
identifies with the minority party in her state. As we explain below, we tried to
minimize these influences by placing our observers behind a veil of ignorance
regarding the author of the labels under study.88
Support/Opposition. Researchers have found that opinions about public policy
color observers’ perceptions of all sorts of matters touching on the policy
question, ranging from the lessons of scientific studies to the existence of
ambiguity in a statute.89 It is also well established that losers in the political
process—citizens who voted for the candidate or party that did not prevail—are
much more likely than winners to believe the process to be unfair.90 Though our
subjects were not told whether the measure under evaluation passed or failed, we
think subjects may rationalize the possibility of defeat, as it were, by imputing bias
to the label ex ante. That is, voters who support a measure will be more likely to
see bias in favor of a no vote, and voters who oppose the measure will be more
likely to see bias in favor of a yes vote.
Regarding the probability of perceiving bias (as opposed to the direction of
perceived bias), we draw two mutually inconsistent hypotheses from the literature.
88.
89.
See infra Part II.B.2, specifically text accompanying notes 98 –99.
See, e.g., Ward Farnsworth et al., Ambiguity About Ambiguity: An Empirical Inquiry into
Legal Interpretation, 2 J. LEGAL ANALYSIS 1 (2010); Robert J. MacCoun & Susannah Paletz,
Citizens’ Perceptions of Ideological Bias in Research on Public Policy Controversies, 30 POL. PSYCHOL. 43
(2009).
90. See generally CHRISTOPHER J. ANDERSON ET AL., LOSERS’ CONSENT: ELECTIONS AND
DEMOCRATIC LEGITIMACY (2005) (examining how election losers and their supporters respond
to their loss and how institutions shape losing); Michael W. Sances & Charles Stewart III,
Partisanship and Voter Confidence, 2000–2010, (MIT Political Science Dep’t Research Paper No.
2012-12, 2012), http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2035513 (exploring to what
degree voter confidence in the fairness and trustworthiness of election procedures is driven by
satisfaction in the outcome of an election).
532
UC IRVINE LAW REVIEW
[Vol. 3:511
The first hypothesis posits that supporters of a ballot measure will be less likely to
perceive bias than opponents. This hypothesis is grounded in the large body of
work showing that voters who supported the winning candidate in an election are
more likely to perceive the electoral process as fair or legitimate than voters who
supported the loser.91 We did not tell subjects whether the ballot measure passed
or failed, but the simple fact that a proposition has appeared on the ballot may
count as a partial victory—a step toward victory—for supporters, and as a partial
loss for opponents. If winning or losing at the agenda-setting stage has a similar
(albeit weaker) effect on perceived fairness as winning or losing the election itself,
then supporters of a ballot measure should be less likely to perceive bias than
opponents.
The other hypothesis holds that the probability of perceiving bias will vary
with the strength rather than the direction of support or opposition. On this view,
people who care a lot about whether a measure passes or fails are more likely to
perceive bias than people who only mildly support or oppose it. People whose
support or opposition is strong will have firmer views about how the label should
be written, and they will be alert to possible mischief by the “other side.” It will
also be harder for them to accept that a label that does not perfectly capture their
understanding of the measure’s merits or demerits may nonetheless be reasonable.
The strength-of-opinion hypothesis finds some support in a recent study of
perceptions of statutory ambiguity.92 Law students who had firm, outcome-based
preferences between two competing interpretations of a statute were much less
likely to see the statute as ambiguous than students who had weak preferences.93
Reading Skills. We hypothesize that observers with stronger reading skills will
be more likely to find that a label selectively includes or excludes information
about the measure, in a manner calculated to induce a yes or no vote. We assume
that better readers will learn more about the measure from the background
materials we provide (more on this below), and that the more the reader knows
about the measure, the more likely she is to find flaws in editorial decisions about
what to include or exclude from the seventy-five-word label. We do not expect
strong reading skills to increase the probability of perceiving other forms of bias.
Trust in Government, Society, and Direct Democracy; Minority-Party Identification.
Outside the laboratory, ordinary citizens’ perceptions of ballot-label bias probably
have a lot to do with how the observer feels about government and society. Most
obviously, observers who think the ballot-initiative process tends to benefit a few
special interests rather than the public generally will be more likely to perceive bias
than observers who think the ballot-initiative process serves the public good.
Among citizens who know that a partisan, elected official authors ballot labels,
91. See supra note 89.
92. Farnsworth et al., supra note 89, at 257 (investigating statutory ambiguity in the law
through a survey study administered to nearly 1000 law students).
93. Id. at 271.
2013]
ARE BALLOT TITLES BIASED?
533
identification with the state’s minority party probably increases the likelihood of
perceiving bias. The likely effects of trust in government and social trust on
perceived bias probably depend on what the observer knows or assumes about the
identity of the ballot label’s author. Observers who believe the label was written by
a government official are probably more likely to perceive bias if they lack trust in
government. Conversely, observers who think the label was written by the
measure’s proponents will probably perceive more bias if they lack social trust,
that is, confidence in the fairness and integrity of their fellow citizens.
Because our primary goal in eliciting ordinary observers’ bias perceptions is
to obtain information about actual bias, we tried to minimize the influence of trust
in government, minority-party identification, and the like by not revealing the
label’s author. Subjects were told that the label “may or may not” be the actual
label that appeared on the ballot, and “may or may not have been prepared by a
neutral and disinterested person.”
AG Strategy and the Political Covariates of Bias. We hypothesize that labels for
measures that the AG expects to be competitive are more likely to be seen as
biased than labels for uncompetitive measures. As explained in Part II.A, strategic
AGs have greater incentives to write biased labels if the election is expected to be
close. We also predicted more perceived bias for measures that the AG or the
AG’s supporters strongly support or strongly oppose. A strategic AG will be more
willing to bear the costs of writing a biased label insofar as either the AG or the
AG’s most important constituencies cares deeply about the outcome. We expected
the direction of perceived bias to track the political incentives of the author of the
label. That is, we expected to find perceived bias in favor of a yes vote on
conservative measures labeled by a Republican AG and on liberal measures
labeled by a Democrat, and bias in favor of a no vote on liberal measures labeled
by a Republican AG and conservative measures labeled by a Democrat.
2. Operationalization
Implementing the ordinary observer strategy for estimating ballot-label bias
presents a number of challenges. For starters, who should one use as “ordinary
observers”? The ideal subjects for a study of California ballot propositions would
be drawn at random from voting-age Californians—the people on whose behalf
the attorney general acts when labeling ballot measures. Use of a true probability
sample of voting-age Californians would also make it likely that measures of
ballot-label bias incorporate much of the dispersed information about how voters
are likely to see and understand the ballot labels under study.
We decided, however, to use convenience samples of students rather than a
broader sample of Californians. We did this for reasons of cost and convenience,
and because we wanted to ensure that our subjects were able to read and take
notes on a hard copy of a long description of the ballot measure (more on this
below). Had we tried to survey a more representative group of Californians, we
534
UC IRVINE LAW REVIEW
[Vol. 3:511
would have had to provide all materials to our subjects in electronic form, which
would have made the materials harder to read and assimilate. We also would have
had much less confidence that our subjects read the materials with any degree of
care. Our subjects either completed the survey as a class assignment or took it
under our supervision in an experimental lab.
We were concerned that a sample comprised of California students would
contain too few conservatives for us to gauge whether agreement with
conservative propositions or disagreement with liberal ones affects perceptions of
ballot-label bias. We therefore recruited participation by students from Utah as
well as California universities.94
The next implementation challenge was to provide students with a point of
reference for evaluating bias. According to the courts, an unbiased label is one that
reasonably informs voters about the gist of the ballot measure, without taking
sides.95 Evaluating bias therefore requires the observer to understand both how
the ballot proposition would change state law and the likely effects of those
changes. It would be a heroic undertaking for ordinary observers to wade through
the text of a ballot measure, figure out exactly how it would change state law,
gauge the likely effects of these changes, and finally judge whether the label fairly
portrays the gist of the measure. California law recognizes as much, in that it
directs the nonpartisan LAO to prepare a careful summary of each measure’s
principal provisions and likely effects for the ballot pamphlet.96
To simplify the observers’ task, we asked them to evaluate bias in the ballot
label relative to the description of the measure in the ballot pamphlet, rather than having
them read and try to grasp the text of the measure itself. One might object to this
strategy on the ground that the LAO’s characterization of the proposition and its
effects may itself be biased. If so, a ballot label that diverges from the legislative
analyst’s description of the measure could be less biased than a label that’s faithful
to the description.
We think our approach is nonetheless defensible because leading politicians
and commentators who see the AG as biased favor delegating the authorship of
ballot labels to the LAO,97 which is widely regarded as impartial. If our results
show that AGs are faithful to the LAO’s characterization of a measure, then
giving the task to the LAO is not likely to reduce bias. If, instead, our results show
that AGs write biased measures (relative to the LAO’s description) in
94. We are extremely grateful to Quin Monson and Laura Marostica at Brigham Young
University and to Sean Farhang at UC Berkeley who permitted us to administer our survey to their
students. In addition, 242 students took the survey in UC Berkeley’s Experimental Social Science
Laboratory (XLab).
95. See supra Part I.A.
96. See supra Part I.A.
97. See supra text accompanying notes 3–4.
2013]
ARE BALLOT TITLES BIASED?
535
circumstances where the AG has political incentives to do so, then our results
would provide some support for the proposed reform.98
We did not tell our subjects that the long description for each measure had
been prepared by the LAO, because we thought this might result in judgments of
ballot-label bias becoming colored by opinions about the legislature or (possibly)
the LAO. Instead, subjects were told that the long descriptions had been prepared
by a neutral, disinterested expert.
Subjects received a packet containing the legislative analyst’s descriptions of
three ballot propositions, and were instructed as follows:
“Ballot initiatives” are citizen-proposed reforms that become law if
approved by a majority of voters. They are described on the ballot with a
title and a short written summary of their principal provisions and effects.
In this study, we are collecting information about whether such titles and
descriptions are perceived to be biased.
You will be asked to evaluate 3 ballot initiatives. For each ballot
initiative, you will first carefully read a neutral, 1–2 page description of
the measure and its likely effects, prepared by a disinterested expert (and
included in this pamphlet). You will then read a short title and summary
of the measure on the online survey. The short title and summary may or
may not be the actual title and summary that appeared on the ballot, and
it may or may not have been prepared by a neutral and disinterested
person. We want to know whether you think the title/summary
accurately and fairly describe the measure, or whether you think the
title/summary is biased in some way.99
After reading through the rest of the instruction sheet, subjects logged onto
a Qualtrics webpage, which prompted them to read carefully the first description
98. To be sure, it is formally possible that the legislative analyst could be biased in the
opposite direction of the AG in precisely those circumstances where the AG has a political incentive
to write biased labels, but that seems quite unlikely. It would mean that the nonpartisan and widely
respected legislative analyst behaves like a Democrat when the AG is a Republican, and like a
Republican when the Democrat is an AG. If nothing else, a finding (based on ordinary observer
judgments of ballot-label bias relative to the legislative analyst’s description) that ballot labels are
biased in the manner expected of a strategic, partisan AG would serve up the question of whether it’s
the AG or the legislative analyst who is biased.
99. Christopher S. Elmendorf & Douglas M. Spencer, Are Ballot Titles Biased? Partisanship and
Ideology in California’s Supervision of Direct Democracy, 3 U.C. IRVINE L. REV. 511 app. B (2013), available at
http://www.law.uci.edu/lawreview/vol3/no3/elmendorf_spencer_survey.pdf. Although California
law uses the phrase “title and summary” to refer to the AG-written passage introducing a proposition
in the ballot pamphlet, and the phrase “ballot label” to refer to the AG-written description on the
ballot itself, CAL. ELEC. CODE § 9051 (2003), we described the label to our subjects as a “title and
summary” for that seemed to us much more natural than the descriptor “ballot label.”
536
UC IRVINE LAW REVIEW
[Vol. 3:511
in the pamphlet.100 The next Qualtrics screen displayed the corresponding ballot
label, followed by a series of questions about perceived bias.
Unlike people who have been asked to estimate the number of jellybeans in a
jar, or the weight of a cow, the observers asked to gauge bias in a ballot label may
have different ideas about what constitutes bias. We therefore asked our subjects
not to assess bias generically, but rather to look for particular forms of bias as we
defined them:
Prejudicial language. Does the title/summary use colorful or symbolic
language, creating a risk that some voters who know little about the
measure will end up voting for or against it because of their emotional
response to the colorful or symbolic language?
Argumentative. Does the title/summary make an argument for or
against the proposed measure?
Selective information. Does the title and summary selectively exclude
or include information about the measure in a way that may cause some
voters who know little about the measure to vote differently than they
would have voted if they had carefully read the long description prepared
by the neutral expert?
Other (please describe). Is the title/summary otherwise biased for or
against adoption of the proposed measure?
These definitions of bias track those used by the courts when reviewing
ballot labels, and ballot titles and summaries.101
Subjects who reported perceiving bias received a follow-up question about
the direction of bias, with four answer options: (1) strongly biased toward no vote;
(2) mildly biased toward no vote; (3) mildly biased toward yes vote; and (4)
strongly biased toward yes vote.
After answering the suite of bias questions for each measure, subjects were
asked, “Based on the long description of the measure you read in the pamphlet,
how would you vote on this measure?” The corresponding answer options were:
(1) definitely vote no; (2) probably vote no; (3) probably vote yes; (4) definitely
vote yes.
The survey concluded with several questions designed to tap other
individual-level attributes hypothesized to vary with perceived bias.102 We
measured trust in government, using the standard four-question battery developed
by the American National Election Survey;103 social trust, using both the
100. Qualtrics is a leading online survey platform that we used to create the survey. We
customized the structure and design of the survey using Qualtrics’ web-based software and then
distributed a URL to all respondents that directed them to the site.
101. See supra Part I.A.
102. These are reported in full in Appendix B. Elmendorf & Spencer, supra note 99, at app. B.
103. See Trust in Government Index 1958–2008, AM. NAT’L ELECTION STUD. (2010), http://
www.electionstudies.org/nesguide/toptable/tab5a_5.htm.
2013]
ARE BALLOT TITLES BIASED?
537
conventional General Social Survey (GSS) questions and alternatives developed by
the economist Edward Glaeser;104 and trust in the ballot-initiative process, for
which we adapted one of the ANES trust-in-government questions.105 To measure
reading ability, we asked subjects about their score on the “Critical Reading”
(Scholastic Aptitude Test, or SAT) or “Reading” (ACT) section of their college
entrance exam.106
Because we expected observers’ judgments of bias to be noisy and colored
by individual-level traits, we figured we would need a substantial number of
observations to detect strategic bias in ballot labels. Between 1974 and 2010 the
California attorney general authored 185 ballot labels. We randomly selected
ninety ballot labels for observation, stratified by attorney general: approximately
sixteen labels per each of the six AGs during this time period.107 We obtained a
total of 996 ballot-label evaluations from 255 California and 77 Utah students,108
roughly ten observations per measure.
All nondichotomous independent variables have been normalized to
facilitate interpretation of the results.
3. Results
No doubt primed by our instructions, which cautioned that the labels “may
or may not have been prepared by a neutral and disinterested person,” the
students found bias at strikingly high rates.109 Seventy-four percent of the ballotlabel observations include a finding of bias. By far the most common form of
perceived bias is selective inclusion or exclusion of information. Students found
the label to be faulty in this way in fifty-seven percent of the cases; by contrast, the
prejudicial language, argumentative language, and “other” forms of bias were
detected in thirty-five, thirty-three, and thirty percent of the cases, respectively.
Tables 3 and 4 report the results of a linear probability model (LPM) in
104. Edward Glaeser et al., Measuring Trust, 115 Q. J. ECON. 811 (2000).
105. Here the question we posed: “Would you say that laws enacted through the ballot-initiative
process pretty much benefit a few big interests looking out for themselves or that they benefit all
people?”
Because many of our observers are from Utah, a state that does not have the ballot initiative, we
included the following definition as part of this question: “In states with the ballot initiative, measures
proposed by individual citizens or groups are put on the ballot if the proponents gather signatures
from a certain number of registered voters, and these measures become law if approved by a majority
vote at the next election.”
106. Cf. Cheryl Boudreau, Closing the Gap: When Do Cues Eliminate Differences Between Sophisticated
and Unsophisticated Citizens?, 71 J. POL. 964 (2009) (using SAT math scores as a measure of
experimental subjects’ ability).
107. Our sample represents the following number and percentage of overall ballot labels
authored by each AG: Younger (n=7, 100%); Deukmejian (n=13, 100%), Van De Kamp (n=16,
33%); Lungren (n=16, 38%); Lockyer (n=16, 34%); Brown (n=16, 62%).
108. See supra note 83.
109. Summary statistics are presented in Appendix A.
538
UC IRVINE LAW REVIEW
[Vol. 3:511
which the dependent variable is whether the observer perceived a particular form
of bias and the independent variables include SAT score, trust in government,
trust in direct democracy, social trust, minority-party identification, the measure’s
divisiveness (by party), and the competitiveness of the election. The divisiveness
and competitiveness variables are the same ones we used in the readability study
above. In Table 3, we include a dummy for whether the observer had “definite”
vote intentions on the measure. In Table 4, we replace this variable with a dummy
capturing whether the observer would definitely or probably vote yes.
The first column in these tables (“at least one type of bias”) reports results
from models that use an aggregated measure of perceived bias as the dependent
variable. The dependent variable equals one if the observer perceived one or more
types of bias in the label, and zero otherwise. We consider the results using this
dependent variable to be most informative because, save for the predicted effect
of reading ability, our hypotheses are not specific to particular types of perceived
bias.
Our decision to utilize linear regression is predicated on Occam’s razor. The
mechanics of LPM are easy to understand and the results are simple to interpret:
the coefficients represent the effect (shift in probability) on the dependent variable
of a one unit change in the independent variable—for dichotomous variables a
categorical change, and for normalized continuous variables a shift from the mean
value to one standard deviation above the mean. Coefficients in traditional
nonlinear models (e.g., logit and probit) are more difficult to interpret, especially
coefficients on interaction terms.110
Looking at the results, we see no relationship between the strength of vote
intention and the probability of perceiving bias. People who are really confident
about how they would vote on a measure are no more likely to perceive bias than
people who are wavering. This tends to undercut the hypothesis that strongly
opinionated observers will see bias at a higher rate than observers who care less.111
110. Chunrong Ai & Edward C. Norton, Interaction Terms in Logit and Probit Models, 80 ECON.
LETTERS 123 (2003).
111. This null result may reflect a problem in our survey instrument. We asked subjects
whether they would “definitely” or “probably” vote yes (or no) on the measure as described in the
pamphlet. The definiteness of vote intentions may reflect subjects’ political knowledge as much as it
does their level of care or concern about the issue. In later survey administrations, we asked a subset
of respondents (n=46) how much they would care if each particular measure were passed—a lot,
moderately, a little, or not at all—and note that this measure is uncorrelated to respondents’ voting
preferences (Pearson’s r =0.06). However, this measure of concern about the issue was also not
predictive of bias, nor did it alter any of the coefficients in Table 3 in models that included it (not
presented).
2013]
539
ARE BALLOT TITLES BIASED?
Table 3: Linear Probability Model of Factors Affecting
Whether Bias Was Perceived (Not Direction of Bias)112
At least one Prejudice
type of bias
Argument
Selective
Other
Probability of perceiving bias
0.739
0.353
0.326
0.576
0.291
Strong yes/no preference
0.019
0.018
0.027
0.007
0.030
(0.031)
(0.044)
(0.036)
(0.038)
(0.034)
0.006
0.014
0.053*
0.041†
0.023
(0.022)
(0.020)
(0.023)
(0.023)
(0.021)
0.021
0.023
0.030
0.001
0.033
(0.021)
(0.025)
(0.030)
(0.029)
(0.030)
0.001
0.000
0.000
0.000
0.000
SAT score
Minority party ID
ANES trust in gov’t
(0.001)
(0.001)
(0.001)
(0.001)
(0.001)
Direct democracy trust
0.022
0.064†
0.042
0.067
0.001
(0.036)
(0.038)
(0.043)
(0.042)
(0.040)
GSS social trust
0.010
0.012
0.012
0.010
0.014
(0.008)
(0.010)
(0.007)
Measure competitiveness
Measure divisiveness (by party)
0.002
(0.018)
(0.021)
0.008
0.020
(0.022)
N
R2
0.026
771
0.007
(0.020)
771
0.013
0.012
(0.017)
(0.008)
0.006
(0.022)
(0.018)
0.001
0.037
0.017
(0.016)
771
(0.010)
0.018
(0.027)
771
0.020
0.016
(0.025)
771
0.015
Note: Standard errors (in parentheses) are clustered on measure and respondent.
† Significant at p < 0.10; *p < 0.05
We do find support for the agenda-setters-as-partial-winners hypothesis.
Observers who would vote yes are about seven percent less likely to perceive at
least one type of bias than observers who would vote no (see Table 4). This effect
(p=0.068) in the “at least one type of bias” model is driven almost entirely by
perceptions of selective inclusion/exclusion and “other” bias. There is little
relationship between whether one supports a measure based on the legislative
analyst’s description and whether one thinks the label uses prejudicial language or
makes an argument for or against the measure.
112. The outcome variable is a dummy set to one if respondents perceived bias. Models are
run on the subset of respondents who reported a voting preference and their SAT score (219 ballotlabel observations excluded).
540
[Vol. 3:511
UC IRVINE LAW REVIEW
Table 4: Linear Probability Model of Factors
Affecting General Perceptions of Bias113
At least one Prejudice
type of bias
Probability of perceiving bias
Argument
0.739
0.353
0.326
Vote intention (1=yes)
0.065†
0.019
0.036
(0.035)
(0.036)
SAT score
0.005
0.014
(0.022)
0.023
(0.025)
0.001
Minority party ID
ANES trust in gov’t
Selective
Other
0.576
0.291
0.107*
0.030**
(0.033)
(0.042)
(0.039)
0.053*
0.039†
0.025
(0.020)
(0.024)
(0.023)
(0.021)
0.023
0.031
0.002
0.036
(0.025)
(0.029)
(0.028)
(0.030)
0.000
0.000
0.001
0.001
(0.001)
(0.001)
(0.001)
(0.001)
(0.001)
Direct democracy trust
0.015
0.066†
0.045
0.056
0.011
(0.036)
(0.038)
(0.043)
(0.042)
(0.039)
GSS social trust
0.011
0.012
0.013
0.012
0.016
(0.008)
(0.011)
(0.007)
Measure competitiveness
Measure divisiveness (by party)
N
R2
0.004
0.028
0.013
(0.008)
0.001
(0.010)
0.014
(0.018)
(0.021)
(0.017)
(0.022)
(0.018)
0.005
(0.021)
0.021
(0.019)
0.018
(0.016)
0.004
(0.026)
0.031
(0.021)
771
771
771
0.012
771
0.013
0.021
0.027
771
0.030
Note: Standard errors (in parentheses) are clustered on measure and respondent.
† Significant at p < 0.10; *p < 0.05
We find a significant effect of SAT reading comprehension scores on
perceptions of bias—but not the effect we predicted. We thought good readers
would see more selective inclusion or exclusion bias, for the simple reason that
they would learn more than other readers about the measure from the legislative
analyst’s description. However, as Tables 3 and 4 show, strong readers are actually
less likely than others to perceive this form of bias. They are, however, more likely
than others to think the label makes an argument for or against the measure.
The rest of our individual-attribute results suggest that the veil of ignorance
worked.114 The various trust and minority-party variables have little if any effect
113. The outcome variable is a dummy set to one if respondents perceived bias. Models are
run on the subset of respondents who reported a voting preference and their SAT score (219 ballotlabel observations excluded).
114. We asked a small subset of our survey respondents (n=46) to identify whom they
perceived most likely to have written the measures they evaluated. Just four of the respondents said
2013]
ARE BALLOT TITLES BIASED?
541
on perceived bias. Trust in direct democracy shows the expected negative
relationship to perceptions of selective inclusion/exclusion bias, but not to other
forms of perceived bias. The GSS indicator of social trust has a slight negative
relationship with overall perceived bias, but the effect is trivial. Results using the
Glaeser measure of social trust (not reported) are essentially the same.
What about the political covariates of perceived bias? We do not find much
support for our strategic partisanship hypotheses. Ballot measure competitiveness
is, on balance, positively related to the probability of perceiving bias, though the
effect is not statistically significant. Partisan divisiveness has a slightly negative
effect on the probability of perceiving some forms of bias, and a stronger, positive
effect on the probability of perceiving other forms of bias. In the aggregated, “at
least one type of bias” model—which in our view is the most relevant model—the
sign of the coefficient on divisiveness (by party) is positive but the magnitude is
very small. Because our measure of polarization is normalized (meaning that each
value is recoded as its distance from the group average), the interpretation of the
“divisiveness (by party)” coefficient in Tables 3 and 4 is the effect on perceived
bias between measures with an average divisiveness score (when the variable
equals zero) and measures with a divisiveness score that is one standard deviation
above the average (when the variable equals one). In other words, the coefficient
of 0.008 in Table 3 and 0.005 in Table 4 represents the increased probability (less
than one percent) that a survey respondent perceived bias on measures that were
one standard deviation above the mean, relative to the mean.
In a normal distribution, about sixty-eight percent of the data fall within one
standard deviation of the mean. Thus, in a linear probability model with
normalized independent variables, the effect of shifting an independent variable
from one standard deviation below the mean from one standard deviation
above—equivalent to a shift from the sixteenth percentile to the eighty-fourth
percentile—is twice the coefficient. We therefore estimate that the effect of
shifting partisan divisiveness from the sixteenth percentile to the eighty-fourth
percentile on the probability of perceiving bias is about 1.0% (Table 4) or 1.6%
(Table 3).115 Taking account of the standard errors, we can with ninety-five
percent confidence rule out an effect of this sixteenth-to-eighty-fourth-percentile
shift larger than 5.8% (Table 3) or 5.2% (Table 4).116 Given that our subjects
perceived some form of bias in nearly seventy-four percent of the labels, we can
be quite confident that a measure’s divisiveness has very little substantive effect
they thought an elected partisan official wrote the ballot language, compared to seventeen who
thought the language was written by the initiative’s proponents, and to thirteen who thought the
authors were trained or nonpartisan professionals.
115. All effects described in percentage terms in the text are actually percentage point shifts. To
illustrate, an effect of “five percent” is an increase in the probability of the dependent variable being
equal to one of five percentage points.
116. These effects are calculated as: 2 × (coefficient) + 2 × (standard error).
542
[Vol. 3:511
UC IRVINE LAW REVIEW
on perceived bias in the label. Bias is perceived in the vast majority of cases and at
most is only a couple of percentage points more likely to be perceived with respect
to measures that are much more divisive of partisans than average.
Even if observers actually perceive bias at meaningfully higher rates on
highly divisive measures—which is unlikely, given our results—there are at least
two mechanisms that could explain the effect. First, observers may be detecting
actual bias resulting from AG strategy. Alternatively, observers may have a
tendency to project bias when evaluating labels for ballot measures they consider
controversial or highly partisan.
Table 5: Linear Probability Model of Factors Affecting
Perceptions of Bias Toward a Yes Vote117
Aggregate
Probability of bias toward “yes” vote
Prejudice Argument
0.703
0.676
0.795
Vote intention (1=yes)
0.044
0.047
0.011
Measure conservativeness
0.033
(0.049)
AG party
AG party * measure conservativeness
N
R2
(0.058)
0.064
(0.057)
0.064
Selective
0.702
0.102*
Other
0.748
0.032
(0.043)
(0.071)
0.005
0.044
(0.029)
(0.044)
(0.040)
(0.026)
(0.047)
0.049
0.084
0.035
0.033
0.060
(0.047)
(0.062)
(0.051)
(0.050)
(0.069)
0.029
0.082
0.045
0.013
0.057
(0.044)
(0.060)
(0.048)
(0.048)
(0.075)
666
0.007
369
0.021
333
0.015
582
0.015
309
0.012
Note: Standard errors (in parentheses) are clustered on measure and respondent.
*p < 0.05
We could adjudicate between these hypotheses by looking at the direction of
perceived bias. Though observers may have a tendency to project bias when they
encounter controversial measures, they have no reason (controlling for their vote
intention) to project bias in favor of a yes vote or in favor of a no vote unless they know
the party or ideology of the label’s author. Thus, if observers behind the veil
discern a directional bias that accords with the political preferences of the AG
who authored the label, this would be pretty good evidence of actual, strategic
bias.
117. The outcome variable is a dummy where, conditional on perceiving a particular type of
bias, respondents perceived bias in a yes direction. The first model (“aggregate”) excludes eighty
ballot-label observations where the respondent perceived bias in a yes direction for one type of bias
(e.g., prejudice) and in a no direction for another type of bias (e.g., argumentative).
2013]
543
ARE BALLOT TITLES BIASED?
In the interest of completeness, we report directional bias findings in Tables
5 and 6. (We recognize that it would be quite odd to find directional bias
consistent with our strategic partisanship hypothesis, given the lack of a
meaningful positive correlation between measures’ divisiveness by party and the
probability of labels being perceived as biased.)
Table 6: Linear Probability Model of Factors Affecting
Perceptions of Bias Toward a Yes Vote118
Aggregate Prejudice Argument Selective
Probability of bias toward “yes” vote
Vote intention (1=yes)
Utah Dem.
CA Repub.
0.703
0.676
0.795
0.034
0.049
0.017
Utah Dem. * measure conservativeness
CA Repub. * measure conservativeness
N
R2
0.748
0.089† 0.020
(0.052)
(0.063)
(0.056)
(0.046)
(0.073)
0.028
0.169
0.099
0.136
0.207
(0.139)
(0.165)
(0.130)
0.075†
0.032
(0.133)
(0.157)
0.015
0.009
(0.040)
Measure conservativeness
0.702
Other
0.104**
(0.064)
0.074
0.013
(0.055)
0.035
(0.044)
0.078
(0.083)
0.116
(0.038)
(0.066)
(0.060)
(0.050)
(0.073)
0.169†
0.073
0.114
0.115
0.190
(0.980)
(0.168)
(0.135)
(0.138)
0.118)
0.103**
0.060
0.105*
0.118†
(0.039)
(0.070)
(0.050)
(0.070)
666
0.011
369
0.021
0.018
(0.063)
333
0.017
582
0.024
309
0.017
Note: Standard errors (in parentheses) are clustered on measure and respondent.
† Significant at p < 0.10; *p < 0.05; **p < 0.01
The directional models are limited to cases in which the respondent found
one or more forms of bias in the label. The dependent variable is whether, in the
respondent’s judgment, the bias she discerned favors a yes vote. In a surprising
number of cases (eight percent total), respondents found that a given label was
biased toward a yes vote in some ways (e.g., making an argument) and toward a no
vote in other ways (e.g., selectively including or excluding information). We
exclude these observations from the model in which the dependent variable
118. The outcome variable is a dummy where, conditional on perceiving a particular type of
bias, respondents perceived bias in a Yes direction. The first model (“aggregate”) excludes eighty
ballot-label observations where the respondent perceived bias in a yes direction for one type of bias
(e.g., prejudice) and in a no direction for another type of bias (e.g., argumentative).
544
UC IRVINE LAW REVIEW
[Vol. 3:511
aggregates all four types of bias into a single dummy,119 but we include them
otherwise.120
The independent variables in Table 5 include a vote-intention dummy, the
measure’s conservativeness, the attorney general’s party, and an interaction of AG
party and ballot-measure conservativeness. The AG variable is coded –1 for
Democrats and +1 for Republicans. Thus, AG interacted with “measure
conservativeness” will be positive when the AG’s party supports the measure and
negative when the AG’s party opposes it.
As Table 5 shows, the coefficient on the interaction term is positively signed
in all models—consistent with our strategic partisanship hypothesis—but small in
magnitude and not even close to statistically significant. To get a sense of the
substantive import of the point estimate, it is useful to compare the effect of
shifting “measure conservativeness” from one standard deviation below the mean
(a very liberal measure) to one standard deviation above (a very conservative
measure) under a Democratic AG, with the effect of such a shift under a
Republican AG.121 The following discussion reports effects in the “aggregate”
model, which we consider most informative.122
Measure conservativeness affects the probability of perceiving bias toward a
yes vote both directly and through the interaction with AG party. The direct effect
of increasing measure conservativeness on perceived bias toward a yes vote is
negative and by construction the same regardless of AG party. Under a
Republican AG, the direct and interaction effects basically cancel each other out;
the net effect of shifting conservativeness from one standard deviation below the
mean to one standard deviation above is a reduction in the probability of
perceived bias toward a yes vote of about 0.7%.123 Under a Democratic AG, the
direct and interaction effects reinforce each other; the net effect is a reduction in
perceived bias toward a yes vote of about 12.4%. The difference between the effect
of the í1 to +1 standard deviation shift in measure conservativeness, under
119. In the “aggregate” model, the dependent variable is equal to one if the observer classified
the measure as biased in one or more category(ies) and if that bias was perceived to favor a yes vote.
120. We do not have strong views on whether the “bias in both directions” cases should be
included in the models in which the dependent variable is the direction of a particular kind of bias,
rather than the direction of bias generally. We ran these models both ways (i.e., including or excluding
the bias-in-both-directions cases) and the results are virtually identical.
121. Because the independent variables have been normalized, this is roughly equivalent to
shifting measure conservativeness from the sixteenth percentile to the eighty-fourth percentile. See
supra text accompanying notes 105–06 for further explanation.
122. We have no a priori reason to expect strategic AGs to use one form of bias rather than
another. The aggregate model accounts for all forms of bias, and excludes arguably suspect
observations where bias of one type was seen to favor a yes vote and bias of another type pointed
toward a no vote.
123. As explained earlier, the effect of shifting a normalized, nondichotomous independent
variable in a linear probability model from its mean to one standard deviation above the mean is equal
to the coefficient. The effect of shifting the variable from one standard deviation below to one
standard deviation above is therefore 2 × (coefficient).
2013]
ARE BALLOT TITLES BIASED?
545
Republican and Democratic AGs, on the probability of perceiving bias toward a
yes vote, is about 11.6%. However, this point estimate is not precise. We can only
rule out, with ninety-five percent confidence, differences of more than forty-six
percent.124 An effect of forty-six percent would seem to us quite worrisome,
especially given our initial expectation that individual observers’ bias judgments
would be quite noisy. Thus, while the fairly small and statistically insignificant
coefficient on the interaction term between AG party and measure
conservativeness means that we cannot confirm our strategic partisanship
hypothesis,125 neither can we rule it out.
124. A straightforward way to calculate the largest plausible difference (with ninety-five percent
confidence) between a shift in ballot measure conservativeness from í1 to +1 (i.e., one standard
deviation below the mean to one standard deviation above) under Republican and Democratic AGs
on the probability of perceived bias toward a yes vote, is to set the coefficient on the interaction term
to the upper bound of the ninety-five perecent confidence interval and then to multiply this “upper
bound” estimate by four. This yields an estimate of 46.8%. (One must multiply by four rather than
two, because the effect of increasing a measure’s conservativeness changes signs between Democratic
and Republican AGs, and we’re interested in the difference in the direction of perceived bias on
Democrat and Republican-authored labels. Note that the direct effect of conservativeness on the
direction of perceived bias cancels out when one is looking at the difference between the effect of a
shift in measure conservativeness under Democratic and Republican AGs, so the direct effect, and
uncertainty in the estimate of the direct effect, can be ignored when one is asking about the
differential effect of a shift in ballot measure conservativeness on the direction of perceived bias
under Democratic and Republican AGs.)
Another way to calculate the confidence interval on the difference is by bootstrapping. We ran
the “Aggregate” model in Table 5 on 10,000 bootstrap replications of the data, and calculated the 2.5
and 97.5 percentile values of the estimated difference between the effect of shifting ballot-measure
conservativeness from í1 to +1 under Democratic and Republican AGs. This yielded an upper
bound estimate 39.6%, a bit smaller than the difference calculated directly from the estimated
coefficients. (The difference may be due to the model’s clustering of standard errors.)
125. With a larger sample size we could measure the effect size with more precision, but even
if our findings were statistically significant it is not clear that our point estimate of the interaction
term would be substantively worrisome from a public policy perspective.
Our sample size prevents us from precisely measuring the effect sizes we observe. Using power
analysis we can estimate the sample size necessary to either reject the null hypothesis that Attorneys
General do not engage in strategic behavior, or confirm the null hypothesis, or both. Given a sample
size of 666 (see Table 5), we should be able to estimate a statistically significant difference between
Democratic and Republican AGs if respondents perceived bias at least eleven percent more often
when reading labels written by one or the other. There are 358 observations of Democratic-authored
ballot labels and 308 observations for Republican-authored ballot labels. The standard error of the
difference is b = sqrt(p1(1-p1)/(0.54n) + p2(1-p2)/(0.46n)) with an upper bound of 0.5 × b. See
ANDREW GELMAN & JENNIFER HILL, DATA ANALYSIS USING REGRESSION AND
MULTILEVEL/HIERARCHICAL MODELS 439–47 (2007). Using the conventional power level of 80%
(meaning 80% of the 95% confidence intervals will not overlap 0.5), we can estimate the effect size x
by plugging in 666 to n in the equation above, which simplifies to 666 = (2.8 / x) × 2 or x = 0.11.
The observed effect on the AG party dummy is just 4.5%. In order for us to statistically
significantly estimate an effect of that size we would need a sample size of 3900. We are also
interested in the interaction of AG and polarization. In order for the observed interaction effect size
of 13.3% to be statistically significant, we would need a sample size of 5128. Because standard errors
are proportional to 1/sqrt(n) and for 80% power the estimated effect must be 2.8 standard deviations
from 0, then the standard error must be at least 0.133 / 2.8 = 0.048. Thus we would need a sample
size (0.133 / 0.048) × 2 times as large as the current sample size. 666 × 7.7 = 5128.
546
UC IRVINE LAW REVIEW
[Vol. 3:511
As for the individual-level covariates, we see some limited evidence for our
hypothesis that subjects project bias against their personal position. The
coefficients on the “vote yes” dummy are negative in nearly all of the models, but
they reach statistical significance only when the dependent variable concerns
selective inclusion or exclusion bias (the magnitude of the effect is about ten
percentage points). This is further evidence that personal policy preferences
particularly distort selective inclusion or exclusion bias.126
We find strangely mixed evidence for our minority-party identification
hypothesis. Positive coefficients on the interaction term between Utah Democrat
and conservativeness and negative coefficients on the interaction terms between
California Republican and conservativeness would corroborate our hypothesis that
minority-party identifiers see bias in the direction of the majority party’s ideology.
For Utah Democrats, the sign on the coefficient is positive in most models and
becomes marginally significant in the aggregate model (p = 0.09). For California
Republicans, however, the sign is also positive and highly significant (p < 0.01).
The magnitude of this effect is captured by the sum of the interaction and the
coefficient on “measure conservativeness.” In the “Aggregate” model of Table 6,
the effect for Utah Democrats is 6.5% with a 95% confidence interval of –14% to
29%. The effect for California Republicans is 0.2% with a 95% confidence
interval of 4% to 4%. In other words, we observe some evidence that Utah
Democrats were more likely to perceive bias in favor of a yes vote on conservative
measures, but no evidence that California Republicans perceived a bias towards
yes on liberal measures.
III. DISCUSSION AND CONCLUSION
As readers of the daily paper, we began this project with a dim view of
California AGs’ performance in writing ballot labels. If asked to vote on a
proposition to transfer authority over ballot-measure communications from the
AG to the LAO, we would have certainly voted yes.127 But the results reported
here give us pause.128
126. See supra Tables 3, 4.
127. In previous work, one of us has argued for new independent checks on the
administration of elections. See Christopher S. Elmendorf, Election Commission and Electoral Reform: An
Overview, 5 ELECTION L.J. 425 (2006); Christopher S. Elmendorf, Representation Reinforcement Through
Advisory Commissions: The Case of Election Law, 80 N.Y.U. L. REV. 1366 (2005).
128. It is certainly possible that the nonpartisan legislative analyst would do a better, more
impartial job of labeling ballot measures than the partisan attorney general. But this is far from
obvious; our results provide little if any evidence that the AG has behaved as a strategic partisan in
labeling ballot measures. It is also possible that transferring ballot-label responsibilities to the
legislative analyst would make it harder for him or her to maintain the trust and confidence of
legislative leaders on both sides of the aisle, given the litigation and casting of aspersions that seems
almost inevitably to attend the labeling of ballot measures. An analysis of this trade-off is outside the
scope of this Article.
2013]
ARE BALLOT TITLES BIASED?
547
On balance, we found little evidence to support our hypotheses about
strategic manipulation of ballot-label complexity. To be sure, our results do
suggest that Republican AGs may have tried to confuse or mislead LRL voters by
writing difficult labels on conservative, competitive measures. But this result
should be treated very cautiously, since the same pattern does not occur on very
liberal measures, since there are only two Republican AGs in our sample, and
since there is no evidence that Democratic AGs have tried to manipulate ballotlabel complexity. Indeed, contrary to our expectations, Democratic AGs wrote
harder-to-read labels than Republican AGs. On balance, we think the results of
our ballot measure readability models are strong enough to motivate further
investigation of readability manipulation in other states—particularly in states
whose ballot labels exhibit lots of variation in readability—but not strong enough
to impugn Republican AGs in California.
We found essentially no association between the competitiveness of a ballot
proposition and veiled observers’ perceptions of bias. If AGs were behaving as
strategic actors, it is on competitive measures where one would expect to find the
most bias.
Similarly, the effects of ballot-measure divisiveness (by party) on the
probability of perceiving bias were small and statistically insignificant. We can be
quite certain that the labels of very divisive measures (one standard deviation
above the mean) are no more than 5.8% more likely to be perceived as biased than
the labels of very nondivisive measures (one standard deviation below the mean).
And even if the effect were this large—which is very unlikely—it would not
necessarily imply that AGs have been writing biased labels on measures that divide
the electorate on party lines. It is also possible that observers are predisposed to
“see” (i.e., project) ballot-label bias whenever they recognize the subject of a
measure as one over which Democrats and Republicans clash.
This projection effect should not, however, cause veiled observers to
perceive a directional bias that favors the party of the AG. As such, the strongest
indicator of strategic partisanship would be positive, statistically significant
coefficients on the interaction terms between AG party and ballot-measure
conservativeness in the directional models. We obtained small and statistically
insignificant estimates for these coefficients. But because the standard errors are
fairly large we cannot rule out a substantively important effect.
This illustrates a more general point: our Article does not prove that AGs are
“not biased.” Our measures of partisan divisiveness, competiveness, and
redistributiveness are imperfect. Our observers may not have taken their task very
seriously. Our sample size does not leave us with much statistical power to detect
small effects. It may be that the AG is biased but only on a very small number of
extremely controversial measures, which are not well captured by our divisiveness
scores. Or it may be that the AG is biased on measures that would have big
financial consequences for well-organized interests but do not excite and divide
548
UC IRVINE LAW REVIEW
[Vol. 3:511
partisans. Or it may be that for any given ballot measure there is a range of
plausible labels that nonetheless have different consequences for how people will
vote, and that AGs use their discretion to pick from these defensible labels the
alternative that best serves their party’s positions.
It is an open question whether the responses we elicited from our subjects
are just noise or whether in the aggregate they contain meaningful information
about which labels are more or less biased. The best evidence of meaningful
information would be positive, statistically significant coefficients on the political
variables that, we think, strengthen or weaken the incentive for the AG to write a
biased label. But we’ve obtained no such result. In results not reported here, we
confirmed that the distribution of bias scores across all measures in our study is
extremely unlikely to have occurred by chance if all subjects had the same
probability of perceiving bias and the probability of perceiving bias was unrelated
to the measure or label at issue. But even if bias judgments are no better than the
flip of a weighted coin, there is no reason to expect the weighting of the coin to be
the same across observers.129
One final point is worth reiterating. We do find that individuals’ support for
or opposition to a measure based on the LAO’s description affects their
probability of perceiving bias. This effect is fairly large and statistically significant
in the case of selective inclusion or exclusion bias. It is inconsequential in the case
of argument and prejudicial-language bias. Interestingly, when California courts
review ballot labels for bias, they generally have focused on whether the label
makes an argument or uses prejudicial language rather than on whether the label is
misleadingly selective.130 The courts’ focus, which seems to us fairly arbitrary, is
perhaps explained by an intuition that judges’ policy preferences are more likely to
contaminate their assessments of the latter than the former.131
129. In future work, we may create a rank ordering of California’s post-1974 ballot
propositions by perceived bias. We would purge bias observations of the effects of individual-level
covariates (vote intention, SAT score), and then aggregate the purged scores into a directional,
measure-level estimate of bias. We could then randomly assign experimental subjects to read either
the ballot label or the legislative analyst’s long description of the measure, and afterwards ask subjects
about their vote intention. If our aggregated measure of perceived bias is meaningful, respondents
who are assigned to read biased labels should vote for or against the measure (depending on the
direction of bias) at higher rates than respondents who read only the legislative analyst’s description,
whereas respondents who receive neutral labels should vote similarly to respondents in the legislative
analyst group.
130. We base this claim on our reading of the small number of published opinions. See, e.g.,
supra notes 17–20 and accompanying text.
131. That said, the courts in reviewing the AG’s “title and summary” for the ballot
pamphlet do evaluate whether the AG fairly conveyed the “chief purposes and points” of the
ballot measure. See Yes on 25, Citizens for an On-Time Budget v. Superior Court, 189 Cal. App.
4th 1445, 1452 (2010); Lungren v. Superior Court, 48 Cal. App. 4th 435, 439–440 (1996). This is
tantamount to asking whether the AG in writing the title and summary selectively included or
excluded information in a manner calculated to induce a yes or no vote.
2013]
549
ARE BALLOT TITLES BIASED?
Appendix A:
Summary Statistics of Variables Used in LPMs
INDIVIDUAL-LEVEL COVARIATES
Variable
University
Self-reported party ID
SAT (Critical Reading) / ACT
(Reading) score
N
Values
UC Berkeley
BYU
Strong Dem.
Dem.
Leans Dem.
Independent
Leans Repub.
Repub.
Strong Repub.
255
77
57
56
73
57
48
24
16
%
77
23
17
17
22
17
15
7
5
700–800 / 31–36
650–699 / 28–30
600–649 / 25–27
550–599 / 21–24
Below 550 / Below 21
Don’t know
148
66
36
8
2
70
45
20
11
2
1
21
POLITICAL COVARIATES
Variable
Min
Max
1 (D)
1 (R)
11
83
1
2
GSS social trust
4.24
4.02
Glaeser social trust
6.40
3.37
Competitiveness
0.62
0.02
0.01
0.39
0.31
0.39
Attorney general
Trust in government (ANES)
Ballot trust
Measure divisiveness (by party)
Conservativeness
Average
(SD)
0.10
(1.00)
42.48
(21.88)
1.44
(0.50)
0.08
(2.20)
0.04
(2.06)
0.21
(0.13)
0.20
(0.10)
0.03
(0.22)