Confounding Modernisation in sub-Saharan Africa?

Florian Reiche∗
Department of Politics, University of Sheffield
To my grandfather
In 2007 Botswana had a per capita income of US$ 9,404 and is widely regarded as a stable
democracy.¹ Meanwhile, The Gambia had a per capita income of US$ 1,414 and is a fully-fledged dictatorship. Scholars of modernisation theory would find nothing surprising in
these figures. Their theoretical framework posits two main hypotheses which are
perfectly in line with the characteristics of these two countries: First, countries are more
likely to become democracies as they develop economically. Secondly, as Lipset put it,
“the more well-to-do a nation, the greater the chances that it will sustain democracy”
(Lipset, 1959, p. 75). While The Gambia has a low level of development and is a
dictatorship, material well-being is significantly higher in Botswana which is “therefore”
a democracy. Thus, evidence seems to suggest that modernisation theory holds in sub-Saharan Africa (SSA).
∗ I am highly indebted to Dr Jouni Kuha, Dr Alistair McMillan and Professor Graham Harrison for their help and support in setting this paper up.
¹ Data on Gross Domestic Product (GDP) is taken from Heston et al. (2009). The argument on the regime type in this introductory section is based on the overall Polity IV rating (see Marshall and Jaggers, 2010), with countries in the range from -10 to 0 being coded as dictatorships and countries between 0 and 10 being coded as democracies. Later, the regime coding of Przeworski et al. (2000) is chosen.
But what about Benin? Benin, a country which is regarded as a democracy, had a per
capita income of US$ 1,412 – roughly the same as The Gambia which is a dictatorship. At
the same time, Swaziland had a per capita income of US$ 7,299 – bringing it nearly on par
with Botswana – and is a dictatorship. So which countries are now the exceptions? Does
modernisation hold in SSA more generally, and so Benin is just more democratic than it
is supposed to be theoretically? Is Swaziland just “unlucky”? Or does modernisation have
little explanatory power in SSA, and Botswana and The Gambia fit the theory entirely
by chance? These questions can only be answered in an analysis which is designed to
test for an empirical generalisation of modernisation theory in the region of SSA.
Modernisation theory does not really exist. The term “modernisation” was initially
used to describe a “process of social change whereby less developed societies acquire
characteristics common to more developed societies” (Lerner, 1973, p. 386 cited in
Payne and Phillips, 2010, p. 63). It takes its roots in the advent of the Cold War and
the leading role the United States of America (USA) assumed in the emerging new world
order. The USA took an interest in turning the new developing countries into societies
reflecting the western ideal of thought and to steer them away from communism (see
So, 1990, p. 36). From the early 1950s onwards, modernisation theory spread into a
broad variety of different sub-fields, “based in economics, psychology, political science
and geography” (Payne and Phillips, 2010, p. 65), so that Payne and Phillips find it
more appropriate to speak about modernisation theories. In line with the initial efforts
by the USA, modernisation theorists in political science see a change of the political
regime to be inevitable as a society is moving from the “traditional” to the “modern”
state. “While dictatorships might be sustainable in immature societies, this is no longer
the case in mature societies once they develop economically.” (Clark et al., 2008) The
rationale behind this argument is summed up well by Przeworski et al.:
[As] a country develops, its social structure becomes complex, new groups
emerge and organize, labor processes require the active cooperation of employees, and, as a result, the system can no longer be effectively run by command: The society is too complex, technological change endows the direct
producers with autonomy and private information, civil society emerges, and
dictatorial forms of control lose their effectiveness. Various groups, whether
the bourgeoisie, workers, or just the amorphous “civil society,” rise against
the dictatorial regime, and it falls. (Przeworski et al., 2000, p. 88)
In order to research this relationship properly, a quantitative analysis needs to draw on
a reliable and rectangular data set. The notion that the type and the quality of data
play a crucial role in quantitative studies on modernisation – and of course also more
generally – is not new. For example, Arat argues for a new analysis of the validity of
modernisation in the light of “improved data and measurement” (Arat, 1988, p. 23). But
what if there is little or no data? Scholars are aware of this issue, as for example many a
debate over the availability of data on inequality has shown (see for example Przeworski
et al., 2000, p. 117). Unfortunately, the implications of this issue are sometimes not fully
taken into account when researching modernisation in a quantitative framework. This is
especially true in the context of SSA – and without sufficient data coverage, the empirical
generalisation called for above cannot be made. In that case we will never know whether
Benin should really be an autocracy, and Swaziland should be a democracy given their
respective level of per capita GDP.
The main problem of missing data is that most statistical packages use listwise deletion.
This means that as soon as the value of a single variable is missing, the complete
observation is deleted and the available data of all other variables is lost. Not
only are large amounts of valuable data discarded in this process, but the number of
observations may drop so sharply that results become seriously flawed and questionable.
In any case, “inferences from analyses using listwise deletion are relatively inefficient, no
matter which assumption characterizes the missingness, and they are also biased, unless
MCAR [Missing Completely At Random] holds.” (King et al., 2001, p. 51) The focus of
this paper is therefore threefold: First, it discusses the implications missing data
has for the study of modernisation theory in SSA.² It will show that research focusing
on existing data can draw no conclusions on whether modernisation holds in this region,
as the data is reduced to a marginal fraction in the computational process. Secondly, it
proposes a solution to this problem, largely drawing on the software developed by James
Honaker, Gary King and Matthew Blackwell (see Honaker et al., 2011). Lastly, it
uses the imputed data to provide an improved and more robust analysis of modernisation theory
in SSA.
² Unless otherwise stated, the argument henceforth refers only to the quantitative strand of literature.
Data? Which Data?!
It is an open secret that data on socio-economic indicators in SSA is sparse, and a quick
look at existing data sets shows that this is indeed the case. The data set used here is
compiled from a variety of sources³ to maximise coverage, but even then some variables
exhibit a degree of missingness well above 50% (see table 1 for details). The variables
chosen to operationalise economic development are by no means exotic but, as table 1
shows, are basic socio-economic indicators, spanning the period from 1950 or the country’s respective
year of independence until 2002. Figure 1 visualises the degree of missingness for each
variable. All variables used are shown on the x-axis (see table 1 for labels) and are sorted
by descending level of missingness. The 44 countries (cross sections) are placed on the
y-axis.
Given these figures it does not come as a surprise that 1628 observations out of 1742 are
deleted when a simple probit (rgdpch, infant, primsch, urban, odac, pop, fuelx, alesinae,
alesinal and alesinar) is run for African⁴ countries only. This is equivalent to 93.5% of
the original data. Even if infant, which has the highest fraction of missing data, is taken
out of the model and replaced by life, which is regarded as inferior to infant mortality
as an indicator of health (see Todaro and Smith, 2006, p. 392), only 295 out of
1742 observations remain in the model, which corresponds to 16.93%.
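The mechanics of listwise deletion, and the shares quoted above, can be illustrated with a minimal Python sketch. The toy rows and the helper name `listwise_delete` are hypothetical and are not drawn from the data set used in this paper; only the counts 1742, 1628 and 295 come from the text.

```python
# Toy illustration of listwise deletion: a row is dropped as soon as
# any one of the model variables is missing (None).
# The rows below are invented for illustration only.

def listwise_delete(rows, variables):
    """Keep only rows in which every listed variable is observed."""
    return [r for r in rows if all(r.get(v) is not None for v in variables)]

toy_rows = [
    {"rgdpch": 1.0, "urban": 0.3, "infant": None},
    {"rgdpch": 2.0, "urban": 0.5, "infant": 40.0},
    {"rgdpch": None, "urban": 0.2, "infant": None},
    {"rgdpch": 3.0, "urban": 0.4, "infant": None},
]

complete = listwise_delete(toy_rows, ["rgdpch", "urban", "infant"])
print(len(complete), "of", len(toy_rows), "toy rows survive")

# Shares reported in the text, recomputed from the stated counts.
total = 1742
share_deleted = 1628 / total         # model including infant
share_kept_with_life = 295 / total   # model with life replacing infant
print(f"deleted: {share_deleted:.1%}, kept with life: {share_kept_with_life:.2%}")
```

Three of the four toy rows are lost because of a single missing variable, even though GDP and urbanisation are observed for most of them.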
How is the problem dealt with so far?
Theoretically, economic development is a multifaceted phenomenon which necessitates
a comprehensive operationalisation. So the empirical fact that coverage for per capita
GDP is nearly complete should not lure a researcher into reducing development to a
single, monetary indicator. Even if, for example, Przeworski et al. find that per capita
GDP “can best predict the incidences of various political regimes” (Przeworski et al.,
2000, p. 83) and that other variables add little explanatory value to their basic model,
theoretical justifications should not be cast aside in favour of statistical rigour in subsequent studies. In their initial model for a global study of modernisation, the authors have only
³ A full overview of sources is provided in Appendix I.
⁴ Africa and SSA are used interchangeably.
Variable | Label | Fraction missing
countryname | Name of the Country | 0.000
countrycode | Own Country Code | 0.000
year | Year of observation | 0.000
reg | Regime type according to Przeworski et al. | 0.000
reglag | Regime type, lagged by one year | 0.000
land | Land area (sq. km) | 0.000
pop | Population (in 1,000) | 0.000
colony | Colonial Background | 0.000
imfcode | Country Code by IMF | 0.020
urban | Urban population (% of total population) | 0.022
alesinae | Ethnic fractionalisation | 0.025
alesinar | Religious fractionalisation | 0.025
rgdpch | Real GDP per capita (2005 constant prices) | 0.027
life | Life expectancy at birth | 0.032
alesinal | Linguistic fractionalisation | 0.048
odac | Official development assistance and official aid (2005 constant prices) | 0.062
agri | Average share of agriculture over GDP | 0.557
yrtsch | Average years of schooling | 0.604
fuelx | Fuel exports (% of merchandise exports) | 0.620
primsch | Primary school enrollment (gross) | 0.649
conc | Index of export concentration | 0.678
infant | Mortality rate, infant (per 1,000 live births) | 0.807
Table 1: Degree of Missingness per Variable
Figure 1: Missingness Map
looked at a few other, time-invariant covariates (see for example Przeworski et al., 2000,
p. 81), but extend their scope considerably in the course of the analysis, also using, for
example, death, birth and infant mortality rates (see Przeworski et al., 2000, pp. 226-230). As shown, the coverage of infant mortality rates for SSA is extremely low,⁵ so that
this region is largely excluded from the global arguments they make (see further below for more on this argument). A similar setup is chosen by Boix. Yet his inquiry
into modernisation places less importance on per capita GDP and is primarily interested
in income inequality, where coverage for SSA is particularly low in the Deininger and
Squire set used (Deininger and Squire, 1996, p. 573). It was indeed so low⁶ that it was
excluded from consideration in this paper. Boix’s model also includes average years
of schooling (yrtsch) (see Boix, 2003, pp. 79-81), for which coverage in SSA is a mere
39.5% in his data set. Originally, this data set stems from the 1993 version of Barro and
Lee (2000a), which provides data on average years of schooling in 5-year intervals. Boix fills
the four missing data points between any two observations by carrying the respective
observed value forward until the next data point is reached. He thus assumes the level
to remain the same for the period between observed values – it will be shown below that
this is not an adequate statistical solution to the problem of missing data. This practice obscures
the true degree of missingness, which is as high as 89%, and casts further doubt on the
conclusions drawn from the analysis of the data.⁷ ⁸
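How carrying values forward can mask the underlying sparsity is sketched below in Python. The series is hypothetical: one invented value every five years, with `None` marking years without an observation. It does not reproduce Boix's data or the Barro and Lee figures, and the function names are illustrative.

```python
def carry_forward(series):
    """Fill each gap with the last observed value (last observation carried forward)."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled

def share_missing(series):
    """Fraction of entries that are None."""
    return sum(v is None for v in series) / len(series)

# Hypothetical country: 20 years, schooling observed only every fifth year.
observations = {1960: 1.5, 1965: 1.8, 1970: 2.1, 1975: 2.6}
observed = [observations.get(year) for year in range(1960, 1980)]
filled = carry_forward(observed)

print(f"missing before filling: {share_missing(observed):.0%}")
print(f"missing after filling:  {share_missing(filled):.0%}")
```

The filled series looks complete, yet four out of every five values are copies of an earlier observation, and the resulting step pattern rules out any change between survey years.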
Now, these studies do not endeavour to provide insights on modernisation in SSA in
particular, but any global study includes the countries of SSA. There are 193 countries
in the world,⁹ out of which 44 are in SSA. This equates to roughly 23% of all
countries. Listwise deletion in turn would discard about 90% of the data for SSA,
so that these countries virtually disappear in any of these analyses. As this region
⁵ Przeworski et al. use World Bank Indicators from 1994, a predecessor of the data set used here; a comparison is therefore viable.
⁶ Within the high-quality data set there are 51 observations for SSA in the period under scrutiny here. This corresponds to a degree of missingness of 97.08%. Note that Atkinson and Brandolini strongly discourage researchers from limiting themselves to the “accept” series as a wider focus helps to “eliminate the most obvious inconsistencies” (Atkinson and Brandolini, 2001, p. 790), but for the context of the argument made here it gives a good indication of the overall data situation.
⁷ Also note that the measure used in this context is average years of schooling for the population below the age of 25. For developing countries, however, average years of schooling for the population below the age of 15 is, as Barro and Lee argue, a better indicator (see Barro and Lee, 2000b, p. 2). These data were not available to Boix at the time of his publication.
⁸ To increase a priori coverage, yrtsch is replaced by primsch (gross enrollment rate in primary schooling) to create the final data set used for the analysis.
⁹ There are 193 members of the United Nations (UN); the actual number of countries in the world is slightly higher as not all are members of the UN.
contains some of the poorest countries in the world, such an analytical setup is biased
towards richer countries and is therefore likely to produce seriously distorted results. What
becomes evident from this short discussion, then, is that missing data for SSA seriously
affects global studies and is likely to lead to even greater problems when the focus
turns to SSA more specifically. In the latter case, if researchers of modernisation are
interested in the dynamics in SSA, a common approach seems to be to include a dummy
variable in the analysis. But what does this dummy actually tell us in the light of the
foregoing discussion about missing data and the listwise deletion of incomplete cases this
problem entails? The answer is: very little. The inclusion of a dummy for SSA in these
studies (see for example the work by Helliwell¹⁰ or by Alesina et al.¹¹) raises yet
more reasons why the analysis is likely to be flawed: in a highly heterogeneous sample
the yearly variation in independent variables in less developed countries is likely to be
marginalised by substantially larger yearly changes in developed countries. Also, the
nature of a dummy is very reductionist, so we cannot learn anything about the influence
of individual variables in the African context, only whether the overall relationship holds.
And this in turn is questionable because of the data situation.
In a nutshell, the current availability of data is likely to induce bias in global studies
and certainly does not allow for a robust test of modernisation theory in SSA. So data
coverage must be improved – but how? Naturally, it would be desirable to collect more
data, but especially in a time-series context any hope of doing so is rather naive. So if we
are still interested in a macro-quantitative perspective on modernisation in SSA, there
is only one option: the data needs to be imputed. There are various ways in which to
impute or simulate data. “[Ad-hoc] methods of imputation, such as mean imputation,
can lead to serious biases in variances and covariances” (Honaker et al., 2010, p. 3)
and are therefore undesirable.¹² A preferable way is multiple imputation. For more
information on this method of dealing with missing data, see Appendix II, where the
method is discussed in more detail and information on the setup of the imputation
process for this paper is provided. The process produces 50 rectangular data sets that can be used
for an improved and more robust analysis of modernisation theory in SSA.
¹⁰ Helliwell chooses logged per capita GDP and secondary school enrollment rate, measured as a fraction of adult population, see Helliwell, 1992, pp. 5 and 7. The coverage of secondary school enrollment remains relatively sparse to this day (see WorldBank, 2010).
¹¹ The authors select growth of per capita GDP and other indicators about the nature of the political system, such as the occurrence of adjustments in the executive in the previous year (see Alesina et al., 1996, p. 201).
¹² For a comprehensive discussion of the shortcomings of ad-hoc methods, see Schafer, 1997, pp. 1-2.
Dynamic Probit
Subscribing to the ingenious theoretical distinction between endogenous and exogenous
democratisation by Przeworski et al. (2000), a dynamic probit model, also known as a
Markov Transition Model, seems an appropriate method to test for these two processes.
As far as the actual setup of the dynamic probit is concerned, a researcher has two
options. One option is to create two models and handle the processes of democratic
emergence and of democratic survival separately. This approach is sensible if it is believed that different covariates are involved in emergence and survival, for example. A
precondition for this setup is, however, the availability of sufficient data for both processes. The alternative is to set up a full interaction model which estimates both of these
steps simultaneously.
Out of the 1742 country-years observed in the data set at hand, only 223 are democratic. The
calculation of the probability of democratic survival would thus be based on an analysis
of 205 cases. To avoid any worries about the issue of small samples, it has been decided
to estimate a full interaction model and thus to make use of all data in the calculation.¹³
The estimation is run using the program Clarify (see King et al., 2000) within Stata in
order to combine the 50 data sets in the analysis.
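For readers unfamiliar with how results from several imputed data sets are merged, the following Python sketch applies Rubin's rules, the standard textbook way of pooling one coefficient across m imputations. It is an assumption that this conveys the logic of the combination step; Clarify's own simulation-based procedure is not reproduced here, the function name `pool_rubin` is illustrative, and the estimates and variances are invented numbers.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets using Rubin's rules.

    estimates: point estimate from each imputed data set
    variances: squared standard error from each imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total

# Invented example with m = 5 (the paper uses m = 50).
est = [-0.36, -0.34, -0.35, -0.37, -0.33]
var = [0.026, 0.028, 0.027, 0.025, 0.029]
q, t = pool_rubin(est, var)
print(f"pooled estimate {q:.3f}, pooled standard error {t ** 0.5:.3f}")
```

The between-imputation term is what distinguishes this from simply averaging standard errors: it carries the extra uncertainty that stems from the missing values themselves.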
The Formal Setup
Przeworski et al. (2000) have suggested that per capita GDP (rgdpch) is the best indicator of development (see Przeworski et al., 2000, p. 81, footnote 2), but to also take into
account the importance of human capital in the development process, life expectancy
at birth (life) and gross enrollment in primary schooling (primsch) are included in the
model. As SSA is largely agricultural, and the degree of urbanisation can be seen as an
indicator of a more advanced economic structure, the significance of urbanisation (urban) is also tested. To assess the effectiveness of official development aid (ODA)
¹³ It should be noted that it generally makes no difference in a probit world whether a full interaction model or two subset regressions are estimated: a probit model fixes the error variance at σ² = 1, and therefore the estimated parameters of the separate and the full interaction model will be the same. In the case of a continuous dependent variable, the full interaction model would naturally estimate one single σ², but the subset regressions would each have a different σ² – the estimated parameters from these would therefore differ from each other, too.
to bring democracy about, ODA (odac) is also brought into the model. An equally
interesting variable is fuelx, as it has often been argued that oil hinders the prospects of
democracy (see for example Ross, 2001). The size of the population (pop) as a proxy for
country size, and measures of ethnic (alesinae) and religious (alesinar) fractionalisation
as calculated by Alesina et al. (2002) serve as control variables.
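Formally, this model can be written as:¹⁴

\begin{equation}
\begin{aligned}
P(D_{it}) = \Phi\bigl(&\beta_0 + \beta_1 \ln(\mathit{rgdpch}) + \beta_2 \ln(\mathit{odac}) + \beta_3\,\mathit{urban} + \beta_4 \ln(\mathit{pop}) \\
&+ \beta_5\,\mathit{primsch} + \beta_6\,\mathit{life} + \beta_7\,\mathit{alesinae} + \beta_8\,\mathit{alesinar} \\
&+ \beta_9\,\mathit{fuelx} + \beta_{10}\,\mathit{ldemoc} + \beta_{11} I_D \ln(\mathit{rgdpch}) + \beta_{12} I_D \ln(\mathit{odac}) \\
&+ \beta_{13} I_D\,\mathit{urban} + \beta_{14} I_D \ln(\mathit{pop}) + \beta_{15} I_D\,\mathit{primsch} + \beta_{16} I_D\,\mathit{life} \\
&+ \beta_{17} I_D\,\mathit{alesinae} + \beta_{18} I_D\,\mathit{alesinar}\bigr)
\end{aligned}
\tag{1}
\end{equation}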
where P(D_it) is the probability that a country i is a democracy in year t, Φ(·) is the
cumulative standard normal distribution function and I_D is an indicator variable for democracy in the previous period (notation adapted from Epstein et al., 2006, p. 553). If I_D = 0, the interaction terms attached to coefficients
β11 to β18 drop out, and hence coefficients β1 to β10 deliver the impact the
respective variable has on the probability of democratic emergence. Note that per capita
GDP, ODA payments and the population size have been logged so that their
distributions are approximately normal.
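To make the role of the indicator for lagged democracy concrete, here is a minimal Python sketch of a stripped-down version of the model with a single covariate. The coefficient values and function names are hypothetical and are not estimates from this paper; they are chosen only to show that the interaction term contributes nothing when the indicator is 0 and that the two slope coefficients add up when it is 1.

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def prob_democracy(x, lag_democ, b0, b_x, b_lag, b_inter):
    """P(D_it) for a one-covariate transition model.

    x:         lagged covariate, e.g. ln(rgdpch)
    lag_democ: 1 if the country was a democracy in t-1, else 0
    """
    index = b0 + b_x * x + b_lag * lag_democ + b_inter * lag_democ * x
    return std_normal_cdf(index)

# Hypothetical coefficients, not estimates from this paper.
b0, b_x, b_lag, b_inter = -1.0, -0.3, 2.0, 0.4
x = 1.5

p_emergence = prob_democracy(x, 0, b0, b_x, b_lag, b_inter)  # autocracy in t-1
p_survival = prob_democracy(x, 1, b0, b_x, b_lag, b_inter)   # democracy in t-1
print(p_emergence, p_survival)
```

For a country that was an autocracy in the previous period only `b0` and `b_x` enter the index; for a previous democracy the effective intercept is `b0 + b_lag` and the effective slope is `b_x + b_inter`.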
The estimates for democratic survival are less straightforward. First, each coefficient now corresponds to the sum of two betas, for rgdpch for example the sum of β1 and
β11.¹⁵ Secondly, to test for statistical significance, a Wald test needs to be performed. “A
Wald test is used to determine whether a linear combination of coefficient values is equal
to some constant. Here we wish to test the restriction that, for instance, [β1 + β11 = 0].
See Greene (2003, 484-88). All Wald tests were performed using the postestimation test
command in Stata [10.0].” (Epstein et al., 2006, p. 553, footnote 4)¹⁶
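The test of the restriction β1 + β11 = 0 can be sketched in Python as follows. The variance of the sum requires the covariance between the two estimates, which is not reported in this paper; all numbers below are therefore hypothetical placeholders, and the function name `wald_sum_test` is illustrative rather than a reproduction of the Stata command.

```python
from math import erfc, sqrt

def wald_sum_test(b1, b11, var1, var11, cov):
    """Wald test of H0: b1 + b11 = 0.

    Returns the chi-squared statistic (1 degree of freedom) and its p-value.
    """
    total = b1 + b11
    var_total = var1 + var11 + 2.0 * cov  # Var(b1 + b11)
    stat = total ** 2 / var_total
    # For 1 degree of freedom, P(chi2 > stat) = erfc(sqrt(stat / 2)).
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical inputs, not taken from the estimation in this paper.
stat, p = wald_sum_test(b1=-0.35, b11=0.05, var1=0.027, var11=0.040, cov=-0.020)
print(f"Wald statistic {stat:.3f}, p-value {p:.3f}")
```

Note that the covariance term can make the combined coefficient more or less precisely estimated than either of its components, which is why the significance of the sum cannot be read off the individual p-values.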
¹⁴ Note that all time-variant independent variables are lagged by one period as the regime type in period t is modeled on the covariates in period t − 1.
¹⁵ For an explanation see Beck et al., 2002, p. 4.
¹⁶ All estimates were cross-checked in the Zelig package in R (see Imai et al., 2007 and Imai et al., 2008), due to an issue with exceptionally high degrees of freedom in Clarify. The results are approximately
Descriptive Statistics
Before performing a more in-depth analysis of the data, let us have a look at the data
itself and some interesting patterns within it. Until 1955 there are only three countries
under scrutiny, which is the minimum number of countries at any one time. The number
gradually increases until it reaches its maximum of 44 countries in 1993. This adds up
to a total of 1742 country-years in the data set, of which 223 were spent in democracies and 1519
in autocracies. This corresponds to 12.8% and 87.2% respectively. If all independent
variables are set to their mean, P(democ = 0) = 0.964 and P(democ = 1) = 0.036, which
means that on average a country had a 3.6% chance of being a democracy in any given
year and a 96.4% chance of being an autocracy.
As one of the most widely and frequently used indicators of development, GDP is worth
examining in more detail.¹⁷ Figure 2 shows the trend of per capita GDP from 1950
to 2002. Due to the small number of countries under scrutiny between 1950 and 1960
(as few as three), the confidence intervals are very wide and therefore obscure an interesting trend after
the largest group of countries had gained independence in 1960 (here the number of
observations jumps from six to 23). Figure 3 excludes the pre-1960 period and offers
much greater analytical purchase.
Figure 3 shows an overall climb of per capita GDP over the years. The trajectory starts
with US$ 1579 in 1961, the lowest yearly average in the period under scrutiny. In 2002,
the average income per capita is US$ 2909, which is 1.84 times as high as in 1961. It
peaks for the first time in 1977 at US$ 2380. The subsequent drop can be attributed to a
steep economic decline in some African countries during this time, “manifest in a rise in
inflation (outside the franc zone) and a drastic fall in output, export revenues, and private
capital flows.” (Sandbrook, 2000, p. 11, see also Allen, 1995, p. 312) The second notable
peak occurs in 1990 (US$ 2479). At this stage, many developing countries underwent a
growth crisis and experienced high inflation. The ensuing economic downturn can be
seen very clearly in this figure.¹⁸
equal, however – differences can be attributed to the simulation process.
¹⁷ The following graphs only use data from one imputed data set. As the degree of missingness for per capita GDP was very low (2.7%), the variation between the data sets would not distort the general impression the following argument is supposed to convey.
¹⁸ This period is explored in more depth in Zagha et al., 2005, especially see p. 94.
Figure 2: Average of per capita GDP with 95% confidence intervals
Figure 3: Average of per capita GDP after 1960 with 95% confidence intervals
Findings
Variable | Democratic Transition | Democratic Survival
ln(rgdpch) | -0.3542 (0.033) | -0.3538 (0.076)
ln(odac) | 0.1584 (0.077) | -0.3260 (0.062)
urban | 1.4877 (0.114) | 3.3177 (0.102)
fuelx | 0.1722 (0.706) | -1.2250 (0.468)
life | 0.0091 (0.643) | 0.0349 (0.578)
primsch | 0.0008 (0.865) | 0.0209 (0.268)
ln(pop) | 0.0999 (0.356) | 0.1813 (0.505)
alesinae | -0.4660 (0.473) | -0.7280 (0.676)
alesinar | 0.2794 (0.585) | -1.4004 (0.429)
constant | -4.1364 (0.031) | 5.9023 (0.049)
Table 2: Results of Markov Transition Model for Democratic Transitions and Survival (p-values in parentheses)
When Przeworski et al. published their analysis of the relationship between development
and democratisation in 2000, modernisation theory was about to be declared dead – per
capita GDP, the stronghold of development indicators, had no bearing on democratic
transitions. It was only on democratic survival that this variable had an impact. However, these results have subsequently been challenged by, for example, Boix (2003) and
Epstein et al. (2006) – the debate was open once again.
There is evidence to suggest that per capita GDP has explanatory power for democratisation in SSA (see table 2). Its coefficient is significant,¹⁹ a finding that is at odds with
Przeworski et al. (2000), but perfectly in line with Boix (2003) and Epstein et al. (2006).
Yet the coefficient is negative. In other words, if an autocracy in SSA gets richer, it is
less likely to collapse. Or to be more precise, if per capita GDP rises by 10% in a given
year, a country in SSA is about 3% less likely to become a democracy in the following
year. More detailed and in-depth analysis is needed to find the reason for this direction,
but two explanations seem very plausible in this context.
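The arithmetic behind a statement of this kind can be sketched in Python under stated assumptions. In a probit with a logged covariate, a 10% rise in per capita GDP shifts the linear index by β1 · ln(1.1); how that shift translates into a change in probability depends on the starting point. The baseline probability of 0.036 is borrowed from the descriptive statistics purely for illustration, and evaluating the effect at that single point is an assumption of this sketch, not the procedure used in the paper.

```python
from math import log
from statistics import NormalDist

beta_gdp = -0.3542      # coefficient on ln(rgdpch), Table 2 (transition)
baseline_prob = 0.036   # assumed starting probability, for illustration only

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Shift in the probit index caused by a 10% rise in per capita GDP.
index_shift = beta_gdp * log(1.10)

# Move from the baseline probability to the index scale, apply the shift,
# and map back to a probability.
baseline_index = std_normal.inv_cdf(baseline_prob)
new_prob = std_normal.cdf(baseline_index + index_shift)

print(f"index shift: {index_shift:.4f}")
print(f"probability: {baseline_prob:.4f} -> {new_prob:.4f}")
```

The index shift is measured on the probit scale; the corresponding change in probability varies with the baseline chosen, so the figures printed here hold only for the assumed starting point.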
The gap hypothesis as proposed by Huntington (1968) posits that as a
country attains higher degrees of urbanisation, literacy, education and mass media exposure,
the traditional man becomes exposed to new forms of life: “[these] experiences
break the cognitive and attitudinal barriers of the traditional culture and promote new
levels of aspirations and wants. The ability of a transitional society to satisfy these
new aspirations, however, increases much more slowly than the aspirations themselves.”
(Huntington, 1968, pp. 53-54) This gap between aspirations and satisfaction “generates social frustration and dissatisfaction. In practice, the extent of the gap provides
a reasonable index to political instability.” (Huntington, 1968, p. 54) Huntington explains that social frustration triggers demands on the government. These cannot be
articulated in appropriate channels, however, as the country is not yet able to provide
sufficient political institutions to cope with the increase in the demand for political participation.²⁰ The result of this imbalance is political instability. Huntington
stresses that countries are particularly prone to these instabilities in the early phases of
modernisation (see Huntington, 1968, p. 56). Scholars generally agree that modernisation is a lengthy process which has taken centuries in western societies. It is expected
to be much shorter in newly developing societies as they can leapfrog certain phases
by modelling their development on the example of already modernised countries. Yet the
process still requires generations and is not a matter of a few years (see Huntington,
1971, p. 289). With only about 50 years of independence in the bulk of its countries, it is therefore
reasonable to assume that SSA is still in the early phases of modernisation which makes
countries particularly vulnerable to the aspirations-satisfaction gap. This problem is
aggravated by the poor record of institution-building in SSA through which political
participation could be facilitated.
¹⁹ This particular coefficient is the only one returned significant at the 95% level. For the remainder of the analysis a significance level of 90% is accepted.
²⁰ Political participation is seen as the outcome of the ratio between the level of social frustration and mobility opportunities that exist in the society. In most modernising countries the level of socio-economic mobility is very low (see Huntington, 1968, pp. 54-55).
The second explanation follows on from an argument made by Allen and lies in the
nature of clientelist politics. This form of politics is a result of the decolonisation
process and can thus be traced back to the very beginning of statehood in SSA. The
colonial powers were initially pursuing a slow strategy for releasing their colonies into
independence but were forced into a far more speedy process by internal, nationalist
pressures. As a result, elections were announced at very short notice – parties had
to be built and electoral support had to be secured extraordinarily quickly. There were
two major strategies in operation to achieve this:
a reliance on individuals who already had considerable local followings, and
the use of clientelist (“patronage”) politics to bind local notables to the party
and local voters to the candidates. In essence, voters were offered collective
material benefits (roads, schools, clinics, water etc) for their votes, while candidates and notables were offered individual benefits (cash, access to licenses,
credit or land etc) as well as being portrayed as responsible for the arrival of
the collective benefits. (Allen, 1995, p. 304)
It does not take long to see that a system with this setup is doomed to fail; it triggered
“political decay in the form of conflict and violence, abuse of political and human rights
and corruption.” (Allen, 1995, p. 305) Some countries were able to overcome these
problems by implementing “centralised-bureaucratic” regimes which did retain clientelism, but under the centralised power of a president who controlled the distribution
of clientelist resources, rather than the respective parties themselves. Those countries
which did not manage to transition into this stage plunged into a state Allen labels
“spoils politics”. Spoils politics is characterised by eight key features: a winner-takes-all
principle, corruption and looting of the economy, economic crises, lack of political mediation, repression and violence, communalism, endemic instability and erosion of authority
(for more details on these see Allen, 1995, pp. 307-309). For the purpose of explaining
the negative influence of per capita GDP on the probability of a country to transition
to democracy, the occurrence of economic crises is particularly interesting. In Allen’s
model, economic crises would lead to regime collapse. Conversely, when the economy
does well, a stable regime is to be expected. We see in the data that an increase in per
capita GDP is associated with a lower probability of a transition to democracy, or in other words,
with the stability of an autocratic regime. And this is perfectly in line with the rationale
provided by Allen.²¹
An explanation that sounds superficially appealing lies in the size distribution of income.
A very notable discussion of the link between inequality and democratisation has been
delivered by Geddes who dissects the different theoretical contributions. Without going
into too much detail, she distinguishes between two general models of how inequality can
bring democracy about. The first looks at the division of a society in terms of rich and
poor, as advocated by Boix (2003) and Acemoglu and Robinson (2001). In this model, the rich form dictatorships to safeguard and increase their individual wealth.
These ruling elites would have no incentive to implement a democracy, as they would
have to fear that the median voter demands a highly confiscatory tax system. “It is assumed
that the median voter, who is poor, prefers high taxes in order to redistribute wealth.
The more unequal the income distribution, the poorer the median voter and thus the
more confiscatory the tax rate can be expected to be in a democracy.” (Geddes, 2007,
p. 322) A second model focuses on the divide between rulers and ruled. In this model
the rulers see their position as a means to enrich themselves by taxes set at a rate as
confiscatory as possible without putting citizens off from economic effort. Democratic
concessions are only granted incrementally in order to make their commitment to providing social goods “and other policies that will increase economic growth” more credible.
(see Geddes, 2007, p. 322) Geddes concludes that “the conflict between the rulers and
the ruled are more plausible when applied to recent struggles over democratisation in
Africa (...)” (Geddes, 2007, p. 323). The
fear of redistributive taxation is not a plausible reason for resistance to
democratization since substantial portions of productive assets were state or
foreign owned for much of the late twentieth century. State elites who control
a large portion of productive assets may certainly fear loss of power since it
will dispossess them, but they will not suffer less dispossession because the
income distribution is more equal. (Geddes, 2007, p. 325)
21 Whilst it is unclear from the coding of the data whether regime changes from one autocracy to another autocracy take place, we can certainly say when a democracy would have emerged (and in fact, this is all we care about right now).
Contrary to Geddes’s belief, new data and analysis in this paper now cast doubt on the
usefulness of the second model for SSA, as well; it is unable to account for the negative
coefficient on per capita GDP. It is in the rulers’ best interest to increase economic growth
as they would then be able to create a higher revenue from the tax for themselves. At
the same time we have learned that they would make democratic concessions should
the need for them arise. This runs counter to the negative coefficient of per capita GDP,
however, as the coefficient suggests that the richer a country becomes, the less likely a democracy is
to be implemented. Moreover, this scenario is unlikely to have substance in practice. In
order for this model to explain the coefficient, a substantial share of the tax systems in
SSA would have to conform to the scenario set up in the model. In reality, however,
tax efforts vary widely between countries, as does their economic structure. It has been
shown that factors such as the share of agriculture, mining, exports and the amount of
aid payments impact significantly on the tax rates (see Therkildsen, 2005, p. 45). Now
one might be able to argue that a ruler would have to adapt the tax rate to exactly
those factors in their respective country, but here a third problem comes in. How do we
know that the tax rate is as confiscatory as possible and not just or even substantially
lower? This would require an in depth macroeconomic analysis of all countries which is
beyond the scope of this work. So the explanation is and remains ambiguous at best.
To return to the influence of per capita GDP, let us have a closer look at its impact on
the probability of being a democracy in any given year across countries. The coefficients
in table 2 tell us how much less likely any African country is to transition to democracy
or to sustain democracy in any given year at any given income. The question is now,
whether the level of income influences the probability of being a democracy. Figure 4 is
the result of a first difference analysis which looks at how a country’s probability to be
a democracy in any given year changes if per capita GDP moves from decile to decile
(the value at 10 shows the change from percentile 1 to percentile 10, and so on). The
emerging pattern is very clear. Firstly, all first differences are negative: there is no income
level at which a rise in per capita GDP would be conducive to a country being a democracy. Secondly,
the first difference for the first decile is largest in magnitude, diminishes slightly for the second,
and then shrinks to around -0.005 for subsequent deciles. This suggests that amongst
the poorest countries, a rise in GDP has a greater negative effect on the probability of
a country being a democracy than in richer countries. Considering that the distribution
of GDP is strongly right-skewed (see figure 5), this pattern has significant implications
for the majority of the countries in SSA.
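The logic of such a first difference analysis can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the estimation code behind figure 4: the function names, the coefficient `beta_x` and the term `offset` (which stands in for the contribution of all other covariates, held fixed) are hypothetical, and whether per capita GDP enters the model in levels or logs is left to the caller.

```python
import math
import numpy as np

def normal_cdf(z):
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def first_differences(x, beta_x, offset,
                      percentiles=(1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100)):
    """Change in Pr(y = 1) as x moves from one percentile cut point to the next.

    x       : observed values of the covariate (e.g. per capita GDP)
    beta_x  : probit coefficient on x (hypothetical value supplied by the caller)
    offset  : linear index of all other covariates, held constant (an assumption)
    """
    cuts = np.percentile(x, percentiles)
    # Predicted probability at each cut point under a probit link
    probs = np.array([normal_cdf(offset + beta_x * c) for c in cuts])
    # First differences: probability at the higher cut minus the lower one
    return np.diff(probs)
```

With a negative coefficient every first difference is negative, and because the probit link is non-linear the size of each step depends on where along the income distribution the change occurs.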
Figure 4: dProb(y=1) modeled on decile changes in per capita GDP
Figure 5: Distribution of per capita GDP
Whether this income is derived from oil exports does not seem to matter as the coefficient of fuel exports is insignificant. This implies that the export of oil does not
inhibit democratisation in SSA. It is believed “that growth based on the export of oil
and minerals fails to bring about the social and cultural changes that tend to produce
democratic government” (Ross, 2001, p. 323) and thus lacks the “modernisation effect”
economic development is theorised to induce. This hypothesis is falsified for SSA by the
insignificant coefficient of fuel exports. Again, these findings are at odds with earlier
analyses; Ross, for example, finds the dummy variable for SSA in exploring the relationship between resource wealth and democracy to be significant (see Ross, 2001, p.
345). However, the fuel exports variable covers only oil, whereas Ross also includes minerals
in the model, which might explain the different outcome.
The coefficient of Official Development Aid has a similarly interesting story to tell. The
coefficient is positive, suggesting that a 10% increase in ODA payments increases the
probability of a democracy to emerge by about 1.5%. It might seem odd that contrary to
per capita GDP this coefficient is positive. But unlike GDP, ODA payments are usually
subject to certain conditions being met by the receiving country. As Allen illustrates,
“[towards] the end of the 1980s (...) borrowers were left in no doubt that the development of formal democratic systems (notably the holding of competitive elections),
and attempts to achieve accountability and the rule of law, administrative probity and
good governance, would be regarded as essential for loan eligibility.” (Allen, 1995, p.
312) Empirical evidence suggests that these conditions do have practical implications in
the transformation of African countries.
All other variables that have been tested for – degree of urbanisation, life expectancy
at birth, gross enrollment in primary schooling, size of the country, ethnic and religious
fractionalisation – have no impact on democratisation in SSA. Especially the results for
the first two variables are very surprising. Health and education are not only mutually
reinforcing components of human capital, they also enable any person to participate in
everyday life. This is certainly a precondition for the societal changes modernisation
theory proposes to foster democracy. Evidence suggests that in SSA these factors are
of no relevance. The question is – again – why? It might be possible that social change
has not yet taken place – or at least not to a degree large enough to impact on a change
in political regimes. Whilst the latter explanation is difficult to assess, as there is no
absolute level of any variable or a combination of variables where social change is known
to be large enough, we can conclude from the figures that social change has indeed taken
place. The degree of urbanisation has more than doubled in the period from 1960 to 2002
(15.7% in 1960 and 36% in 2002), and gross primary school enrolment has more than
doubled from 35% in 1960 to 86% in 2002. Life expectancy at birth has increased from
41 to 51 years during this period. Huntington’s gap hypothesis offers a more fruitful
account of the absence of significance in these variables. Social change might well
have taken place, but has been outweighed by excessively high expectations amongst the
population, which leads to the effects and outcome outlined above.
The survival of democracy in SSA is determined by per capita GDP and ODA payments. When we look at the coefficients for democratic survival, a pattern familiar from
democratic emergence arises. Per capita GDP and ODA payments are the only two
significant variables in the analysis. So again, factors that have traditionally been in the
basket of modernisation theorists have no bearing on sustaining a democracy in SSA:
degree of urbanisation, life expectancy at birth, gross enrollment in primary schooling,
size of the country, ethnic and religious fractionalisation end up being insignificant. As
for the two significant variables, the coefficient of per capita GDP is negative, suggesting a reduction of the chance of a democracy to survive by about 3% with every 10%
increase in per capita income. Again, Huntington’s gap hypothesis serves very well to
explain this phenomenon. Appropriate advances might have been made in the political
system to classify a country as a democracy, but people might still be dissatisfied with
the rate of progress the country makes, which leads the regime to collapse. At this
stage, the argument becomes difficult, however, as the index of democracy used in the
analysis does not take into account a participation dimension. From the index alone it
is therefore not obvious if the appropriate channels for participation are in place. If the
gap theory applies, then it would be reasonable to assume that they are not.
What is probably even more noticeable in this part of the analysis, however, is the
negative coefficient of ODA payments. Even though these payments are conducive to
bringing democracy about, they seem to harm the chances of sustaining democracy. The
magnitude of the influence is close to that of per capita GDP. It would therefore seem
that whilst the conditions to which these payments are bound serve well to prepare the
ground for democracy, these conditions lose their leverage once the democracy has been
established. It would now be interesting to look more closely at the micro-processes
which are triggered by ODA payments in democratic emergence and to look at what
they fail to do after transition has occurred, but this is beyond the scope of this study.
Confounding modernisation? – Confounding modernisation!
A look at the democratisation process in SSA throws up many interesting questions.
Why are some countries democracies despite being much poorer than some autocracies
in this region? However interesting this question is, practically no effort has been made
to date to address it. What is more, simply reaching into the standard methodological toolbox would provide little insight into the processes at work, as the availability
of data for SSA is very low. Even for standard indicators of economic development, data
coverage for SSA is very sparse, with some indicators showing as little as 19.3% coverage.
Listwise deletion, which is the default in most software packages, renders the analysis inefficient
and is also likely to introduce bias. In a short literature review it has become obvious
that this caveat has not been taken into account properly in even the most recent major
studies on modernisation. Most of these focus on global data sets but even for these the
problem of missing data is a crucial one. Out of 193 UN member states in the world, 44
are African countries. So about 20% of the countries are in danger of dropping out of
global analyses due to high degrees of missingness in the data and the ensuing process
of listwise deletion in statistical software packages. As these countries are amongst the
poorest countries worldwide, the bias introduced in these studies is likely to be substantial. In order to
rectify this problem for the assessment of modernisation theory in this paper and to provide a robust and solid analysis of SSA as a geographical entity, multiple imputation has
been used to create 50 rectangular data sets. Estimation with these data sets overcomes
the issues other available methods of filling missing data entail.
A Markov Transition Model has been chosen as the most appropriate methodology for
the analysis. The assessment of existing and now extended data has thrown up some
very interesting findings that run counter not only to theory, but also to findings in
previous studies. As these results are based on more complete data, this is not
surprising on the surface, but the implications of these findings are far-reaching.
It is reasonable to conclude that modernisation theory can indeed be declared confounded
for SSA. For a start, per capita GDP influences democratic transitions. This confirms the findings of Boix and Epstein, but runs counter to the findings of Przeworski et al.
What differs from all these studies, however, is the negative influence of per capita
GDP on democratic transitions – the richer a country in SSA becomes, the less likely it
is to become a democracy. This result is even more significant as it is not only at odds
with previous findings (for more information than the studies just cited see chapter ??)
but also with modernisation theory more generally (see chapter ??). Theory proposes a
positive relationship between economic development and democratisation; countries are
supposed to be more likely to become a democracy the more developed they become, and
also more likely to remain a democracy as they develop more. In Africa, the relationship
is negative.
The story is a similar one for democratic survival. Again, per capita GDP has a negative
influence, this time on the probability of a country to sustain democracy as it advances
economically. As with democratic emergence, once more Huntington’s gap hypothesis
can serve as a possible explanation for the negative relationship. Expectations of the
citizens of a particular country become too high and the demands they place on the
government cannot be satisfied. This disappointment leads to a downfall of the regime.
Theoretically, it is possible to construct this gap for both the birth of democracy and
its survival. Yet, this hypothetical construct needs to be tested empirically. Equally
surprising in this part of the analysis is the negative influence of ODA payments on
the survival of democracy. Whilst a positive influence on democratic transitions makes
perfect sense, a negative one on its survival is puzzling. The conditionality to which most of
these payments are bound would be expected not only to increase the chances of bringing
democracy about, but also to safeguard it once it has been established. Again, there are
manifold explanations to account for this direction of the relationship. They range from
a poor record of these conditions actually being implemented, to a misfit of the ODA
conditions with the situation a country is currently in.
For both democratic transitions and democratic survival, other, non-monetary indicators such as life expectancy at birth or average years of schooling have turned out to be
insignificant in the analysis. Why is that the case? Why do so many explanations that
sound perfectly reasonable in theory not have any leverage in reality? Whilst this chapter has offered a few attempts to make sense of the results, these remain guesswork and
a lot of questions remain open, calling for additional investigation.
Appendix I: Data Sources
Variable | Label | Source
countryname | Name of the Country | n/a
imfcode | Country Code by IMF | Boix, 2003
countrycode | Own Country Code | n/a
land | Land area (sq. km) | WorldBank, 2009
year | Year of observation | n/a
reg | Regime type according to Przeworski et al. (1=autocracy and 0=democracy) | Przeworski et al.
reglag | Regime type, lagged by one year | Przeworski et al.
rgdpch | Real GDP per capita (2005 constant prices) | Heston et al., 2009
pop | Population (in 1,000) | Heston et al., 2009
agri | Average share of agriculture over GDP | Boix, 2003
yrtsch | Average years of schooling | Boix, 2003
primsch | Primary school enrollment (gross) | WorldBank, 2010
conc | Index of export concentration | Boix, 2003
fuelx | Fuel exports (% of merchandise exports) | WorldBank, 2010
life | Life expectancy at birth | WorldBank, 2010
infant | Mortality rate, infant (per 1,000 live births) | WorldBank, 2009
odac | Official development assistance and official aid (2005 constant prices) | adapted from WorldBank, 2009
urban | Urban population (% of total population) | WorldBank, 2009
alesinae | Ethnic fractionalisation | Alesina et al., 2002
alesinar | Religious fractionalisation | Alesina et al., 2002
alesinal | Linguistic fractionalisation | Alesina et al., 2002
colony | Colonial Background | own coding
Table 3: Codebook and Sources
Appendix II: Multiple Imputation
Multiple imputation creates m complete data sets, where usually m = 5 is sufficient.
All of these data sets contain the same observed values, but the imputed values vary
across the data sets “to reflect uncertainty levels.” (King et al., 2001, p. 53) These data
sets can then be used to apply the statistical method appropriate for the context of the
respective research and to “estimate some Quantity of interest, Q, such as univariate
mean, regression coefficient, predicted probability, or first difference in each data set
j (j = 1, . . . , m). The overall point estimate q̄ of Q is the average of the m separate
estimates, qj :” (King et al., 2001, p. 53)
$$\bar{q} = \frac{1}{m} \sum_{j=1}^{m} q_j \qquad (2)$$
The standard error of the multiple imputation point estimation is made up of two
parts.
Let SE(qj ) denote the standard error of qj from data set j, and let $S_q^2 = \sum_{j=1}^{m} (q_j - \bar{q})^2/(m - 1)$ be the sample variance across the m point estimates.
Then, as shown by Rubin [Rubin, 1987], the variance of the multiple imputation point estimate is the average of the estimated variances from within each
completed data set, plus the sample variance in the point estimates across
the data sets (multiplied by a factor that corrects for bias because m < ∞):
(King et al., 2001, p. 53, see also Schafer and Olsen, 1998, pp. 18-19)
$$SE(q)^2 = \frac{1}{m} \sum_{j=1}^{m} SE(q_j)^2 + S_q^2 \left(1 + \frac{1}{m}\right) \qquad (3)$$
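Equations 2 and 3 translate directly into code. The following Python sketch is merely an illustration of the combination rules quoted above; the function name and its arguments are hypothetical, and the inputs are assumed to be one point estimate and one standard error per completed data set.

```python
import math

def combine_imputations(estimates, std_errors):
    """Combine m point estimates and their standard errors (equations 2 and 3)."""
    m = len(estimates)
    # Equation 2: the overall point estimate is the average of the m estimates
    q_bar = sum(estimates) / m
    # Average of the estimated variances within each completed data set
    within = sum(se ** 2 for se in std_errors) / m
    # Sample variance of the point estimates across the data sets
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    # Equation 3: total variance, with the (1 + 1/m) correction for finite m
    total_variance = within + between * (1.0 + 1.0 / m)
    return q_bar, math.sqrt(total_variance)
```

The between-imputation term is what distinguishes this from simply averaging the standard errors: the more the estimates differ across data sets, the larger the combined standard error.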
Two general algorithms are available to solve the estimation problem: Imputation-Posterior (IP), an algorithm based on MCMC, and Expectation-Maximisation (EM). The latter is chosen for solving the imputation problem of the data set at hand.
The “idea of the EM algorithm is marvelously and beguilingly simple.” (Gill, 2008, p.
309) Suppose we have a data matrix Y of which a certain fraction Ymis is missing at
random (MAR)22 . The rest of this matrix is observed and labeled Yobs . Essentially, this
breaks the distribution function f (Y |θ) up as follows:
$$f(Y \mid \theta) = f(Y_{obs}, Y_{mis} \mid \theta) = f(Y_{obs} \mid \theta)\, f(Y_{mis} \mid Y_{obs}, \theta) \qquad (4)$$
where θ is an unknown k-dimensional coefficient vector of which we would like to obtain
the posterior distribution (see Gill, 2008, p. 310). The EM algorithm first fills the
missing data in with a temporary, reasonable guess (for details on what “reasonable”
means in this context, see Schafer, 1997, p. 39). The algorithm then proceeds as if
the data were now complete and estimates the parameters θ. In a second step, the
parameter estimates are used to find better guesses for Ymis , the data that was initially
missing. This interaction is best demonstrated by the term f (Ymis |Yobs , θ) in equation 4
which can be seen as the “predictive distribution of the missing data given θ” (Schafer,
1997, p. 38). The iteration between θ and Ymis is repeated until the algorithm reaches
a stationary point – in well-behaved problems a global maximum where EM “yields the
unique maximum-likelihood estimate (MLE) of θ, the maximizer of l(θ|Yobs ).” (Schafer,
1997, p. 39). There are various reasons why this global maximum can potentially not
be reached, a problem that also needed to be solved for the data set on SSA (see below).
For a general discussion of possible reasons for non-convergence, see Schafer, 1997, pp.
51-55.
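The alternation between θ and Ymis described above can be sketched for the multivariate normal case as follows. This is a minimal textbook-style illustration and not the implementation used by Amelia II; the function name, the starting values (observed column means and a diagonal covariance matrix) and the convergence tolerance are assumptions made for the example.

```python
import numpy as np

def em_mvn(Y, n_iter=200, tol=1e-8):
    """EM estimates of mean and covariance for multivariate normal data with NaNs."""
    Y = np.asarray(Y, dtype=float)
    n, k = Y.shape
    miss = np.isnan(Y)
    # Temporary starting guess: observed column means, no covariances
    mu = np.nanmean(Y, axis=0)
    sigma = np.diag(np.nanvar(Y, axis=0))
    for _ in range(n_iter):
        sum_y = np.zeros(k)
        sum_yy = np.zeros((k, k))
        for i in range(n):
            m, o = miss[i], ~miss[i]
            y = Y[i].copy()
            c = np.zeros((k, k))  # conditional covariance of the missing block
            if m.any():
                if o.any():
                    # E-step: expected missing values given the observed ones
                    b = sigma[np.ix_(m, o)] @ np.linalg.inv(sigma[np.ix_(o, o)])
                    y[m] = mu[m] + b @ (y[o] - mu[o])
                    c[np.ix_(m, m)] = sigma[np.ix_(m, m)] - b @ sigma[np.ix_(o, m)]
                else:
                    y[m] = mu[m]
                    c[np.ix_(m, m)] = sigma[np.ix_(m, m)]
            sum_y += y
            sum_yy += np.outer(y, y) + c
        # M-step: re-estimate the parameters as if the data were complete
        mu_new = sum_y / n
        sigma_new = sum_yy / n - np.outer(mu_new, mu_new)
        done = (np.max(np.abs(mu_new - mu)) < tol
                and np.max(np.abs(sigma_new - sigma)) < tol)
        mu, sigma = mu_new, sigma_new
        if done:
            break
    return mu, sigma
```

On complete data the sketch returns the sample mean and the maximum-likelihood covariance matrix after the first pass; with missing entries it iterates until the parameter estimates stop changing.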
22 Multiple imputation models usually assume MAR to hold. See King et al., 2001, pp. 50-51 for a detailed discussion.
Figure 6 illustrates the process of multiple imputation using the software Amelia II
(Honaker et al., 2011). It shows that as a first step, the software – which is used to
generate the imputations here – adds to the classic EM algorithm by bootstrapping the
data for each draw23 . This is a first provision in order to account for the uncertainty
multiple imputation seeks to reflect – possibly the strongest argument against claims of
simply “making data up” (see also ). To further incorporate uncertainty into the process,
the EM algorithm is then run “to find the mode of the posterior for the bootstrapped
data” (Honaker et al., 2010, p. 5, see also further below). As explained above, the
analysis of the data then proceeds as normal and the overall point estimate q̄ is calculated
by equation 2.
Figure 6: Schematic of Multiple Imputation using Amelia, adapted from Honaker et al.,
2010, p. 6
Empirics, Empirics, Empirics
Due to the data structure, the imputation setup is not as straightforward as one might
hope. First, as EM assumes the data to be multivariate normal24, variables are transformed if and as appropriate. For example, urban is a proportion and therefore restricted
between the logical bounds of 0 and 1. A logistic transformation is applied in this case
“to make the distribution symmetric and relatively unbounded.” (Honaker et al., 2010,
p. 19)25
23 Amelia draws m samples of size n with replacement from the original data set.
24 For more assumptions, see for example Honaker et al., 2010, p. 4.
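As a brief illustration of what such a transformation does, the following Python sketch maps a proportion onto the whole real line and back. It is only a sketch: the function names are hypothetical, and the exact form of the transformation applied inside Amelia is not reproduced here.

```python
import math

def logit(p):
    """Map a proportion strictly between 0 and 1 onto the real line."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))

def inv_logit(z):
    """Map a value on the real line back to a proportion between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))
```

The transformation is undefined at exactly 0 or 1, which is why values on the boundary need to be moved slightly into the interior before it can be applied.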
It is beneficial to omit those variables with a high degree of missingness which will not
be used in the final model of analysis from the imputation for at least two reasons.
First, due to their high degree of missingness they do not add substantial amounts of
information to the algorithm. Secondly, it would be intuitive to assume that due to
the time-series, cross-sectional (tscs) nature of the data, allowing for imputations to
vary over time will deliver better results than a static setup. Figure 7 suggests that
this is indeed the case: the confidence intervals are slightly smaller when a second-order
polynomial is included. Excluding the variables with the highest degree of missingness
which will not be used in the analysis later on leads to a data structure that allows
a polynomial varying across cross-sections to be introduced. If one of the other variables
were introduced, some cross-sections would entirely lack data for some variables and thus
the imputation would have no values from which to impute the missing ones. In such a setup, the
algorithm is unable to converge and to find a global maximum, a problem addressed
further above. Thus, the same time pattern would have to be imposed in all countries
which would be far less realistic. The variables used for the imputation are as follows:
year, countrycode, imfcode, countryname, reg, reglag, colony, rgdpch, land, pop, primsch,
odac, urban, fuelx, alesinae, and alesinar.
As the degree of missingness in some variables is quite high (for example in fuelx and
primsch), the variance in coefficients between different data sets is also relatively large.
This in turn affects the standard errors and ultimately the p-value of each coefficient.
In order to take this issue into account, the number of imputations has been set to
m = 50, much higher than usual. An even higher number would of course be possible,
but the changes occurring in both coefficients and p-values become marginal. Lastly,
a 1% ridge prior was added to the setup. Without this prior, the covariance matrix of
the estimated complete data set was repeatedly non-invertible. This prior adds the
chosen percentage of artificial observations to the data set “with the same means and
variances as the existing data but with zero covariances” (Honaker et al., 2010, p. 23).
This helps to shrink the covariances, but keeps the mean and the variance the same and
thus adds more a priori structure to the data (see Honaker et al., 2010, pp. 22-24).
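The effect of such a prior on the covariance matrix can be illustrated with a stylised Python sketch. It follows the verbal description quoted above (artificial observations with the same variances but zero covariances) and is not Amelia's actual code; the function name and the weighting scheme are assumptions made for the illustration.

```python
import numpy as np

def ridge_prior_covariance(sigma, n, prior_share=0.01):
    """Shrink covariances towards zero while leaving the variances unchanged.

    sigma       : covariance matrix estimated from n observations
    prior_share : artificial observations as a share of n (0.01 for a 1% prior)
    """
    sigma = np.asarray(sigma, dtype=float)
    n_prior = prior_share * n
    # Covariance matrix of the artificial observations: same variances, zero covariances
    diag_only = np.diag(np.diag(sigma))
    # Pool real and artificial observations, weighted by their numbers
    return (n * sigma + n_prior * diag_only) / (n + n_prior)
```

In this reading the diagonal is untouched and every off-diagonal element is scaled by n / (n + n_prior), which moves the matrix away from singularity.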
25
In order to allow a logistic transformation for fuelx, the value for non-oil-exporting countries was set
to 0.0001.
Figure 7: Comparison of imputations without (a) and with (b) second order polynomial
for Mauritius
There are various diagnostic tools available to judge the quality of the imputations. As a
first step, a look at the respective distributions of the existing and the imputed data
can, as a minimum, indicate whether the boundaries of the imputation
make sense (see Honaker et al., 2010, p. 30). Figure 8 (page 30) shows a comparison
of existing data (black) and mean imputations (red) for primsch, the variable with the
highest degree of missingness. The imputations are well behaved and stay within the
logical boundaries or what could be reasonably suggested as possible values.
To find out whether the likelihood is well behaved generally and the algorithm reaches
convergence,
the EM chain [can be run] from multiple starting values that are overdispersed from the estimated maximum. The overdispersion diagnostic will
display a graph of the paths of each chain. Since these chains move through
spaces that are in an extremely high number of dimensions and can not be
graphically displayed, the diagnostic reduces the dimensionality of the EM
paths by showing the paths relative to the largest principle components of
the final mode(s) that are reached. (Honaker et al., 2010, p. 33)
Figure 9 illustrates that the algorithm converges nicely.
A question that is yet unanswered is how accurate the imputed values themselves actually are. To make a judgment on this question, it seems necessary to compare the
unobserved values with the imputed ones. Inconveniently, the unobserved values do not
exist by definition. What is available, however, are the observed values. Conveniently, these can be used to judge the accuracy of the imputations: the overimpute
diagnostic of Amelia goes through all observed values, treats each as if it were missing and
generates several hundred imputations for each of them. This allows the construction of “a
confidence interval of what the imputed value would have been, had any of the observed
data been missing.” (Honaker et al., 2010, p. 30) Figure 10 constructs 90% confidence
intervals for the variable primsch. The highest possible quality (for a chosen level of
confidence) is reached when all confidence intervals cross with the y = x line – here, the
imputation would perfectly predict the observed value. Accordingly, the quality of the
imputations for primsch is highly satisfactory26 .
26
Repeating the exercise for other variables delivers equally high quality results.
Figure 8: Comparison of Densities for “primsch”
Figure 9: Convergence of Algorithm
Figure 10: Observed vs. Imputed Values (primsch)
Crystal Ball and Fog Machine?
Multiple imputation still does not belong to the standard methods of the social scientist.
In fact, even about 20 years after the seminal article by Rubin in 1977, Schafer and Olsen
still noted that it was – except amongst a few experts – “largely unknown and unused”
(Schafer and Olsen cited in King et al., 2001, p. 50). So a few readers might wonder:
is multiple imputation not a deep look into the crystal ball that then provides some
fancy maths as a fog machine to cover the tracks? Or put slightly less cynically: does
multiple imputation not make data up? As a second issue, there might be doubts about
the reliability of existing data for SSA in the first place.
The question whether this procedure actually makes data up is very easily dealt with.
The short answer is: no. Put more elaborately: it would, if only a single
imputation were used. By using multiple imputations, however, the uncertainty of the
data is reflected, thereby acknowledging that no hard and fast estimate can be given
for any missing value (see Schafer, 1999, p. 8). If we impute only one single value, we
assume that we are 100% sure that this value is a true reflection of the missing one.
But this value is only a guess; an estimate. And we cannot be sure about its accuracy.
So we had better let our data reflect this uncertainty. To achieve this, we create m data sets
in which the existing values are kept, but the imputed values vary across the data sets.
The m data sets created in this process of multiple imputation are then used to deliver
the quantities of interest, which in turn reflect this uncertainty as well.
Put in a more technical way: Amelia first bootstraps the data. In this process, m samples
of size n with replacement are taken from the original data set. As a second step, the
EM algorithm is run for each sample (see Honaker and King, 2010, p. 576). Due to
the multivariate normal assumption for all variables, D ∼ N (µ, Σ), the imputation of
missing values has the setup of a linear regression:
$$\tilde{x}_{ij} = x^{obs}_{i,-j} \tilde{\beta} + \tilde{\epsilon}_i \qquad (5)$$
where $\tilde{x}_{ij}$ “[denotes] a simulated missing value from the model for observation i and
variable j, and (. . . ) $x^{obs}_{i,-j}$ [denotes] the vector of all observed variables in row i, except
variable j (the missing value we are imputing).” (Honaker and King, 2010, p. 576) As
due to a finite sample the exact values of µ and Σ are still unknown, the bootstrapping
mechanism provides for estimation uncertainty. Fundamental uncertainty is reflected
in $\tilde{\epsilon}_i$ which comes about because of influences that happen by chance and “that may
influence Y but are not included in X.” (King et al., 2000, p. 349).27
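The two sources of uncertainty can be made concrete with a simplified Python sketch of equation 5. It is an illustration only: it replaces the EM step with an ordinary least-squares fit on each bootstrap sample, assumes that only one column contains missing values, and uses hypothetical names throughout.

```python
import numpy as np

def bootstrap_impute(X, j, m=5, seed=0):
    """Return m completed copies of X in which the NaNs of column j are imputed."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    others = [c for c in range(k) if c != j]
    missing = np.isnan(X[:, j])
    completed = []
    for _ in range(m):
        # Estimation uncertainty: re-estimate the coefficients on a bootstrap sample
        boot = X[rng.integers(0, n, size=n)]
        boot = boot[~np.isnan(boot[:, j])]
        A = np.column_stack([np.ones(len(boot)), boot[:, others]])
        beta, *_ = np.linalg.lstsq(A, boot[:, j], rcond=None)
        sd = (boot[:, j] - A @ beta).std()
        # Fundamental uncertainty: add a random disturbance to each prediction
        A_mis = np.column_stack([np.ones(missing.sum()), X[missing][:, others]])
        filled = X.copy()
        filled[missing, j] = A_mis @ beta + rng.normal(0.0, sd, size=missing.sum())
        completed.append(filled)
    return completed
```

Each returned data set keeps the observed values untouched, while the imputed values differ from one data set to the next.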
Certainly, accuracy and reliability of data need discussing in quantitative research, and
even more so in the context of developing countries. For the data sets used in this study,
there does not seem to be an assessment of these issues available, however. The mere
fact that they are widely used and form the basis of many quantitative analyses, should
not be reason to believe that their respective quality is necessarily high. A word of
caution is as far as it is possible to go at this point: one should bear in mind, that these
data come from developing countries that are poor and might lack the monetary means
to collect data properly, or that they are at war or in a period of civil unrest; periods
when the focus of the public sector (if any) is not on collecting data. So the answer to
the question whether we can trust even the existing data is a somewhat unsatisfactory one:
“It is as good as we are going to get”.
The robustness of the imputed data can be judged as high. Only “[if] a very large
fraction of missingness exists in a data set, then multiple imputation will be less robust,
but listwise deletion and other methods will normally be worse.” (King et al., 2001, p.
57) The overall degree of missingness in the data set used for imputation is 15.5%. Even
if some scholars might judge this as relatively high, the quality of the ensuing analysis
will still be better than previous ones.28
27
As explained above, these imputations are then put into their respective positions, thus creating m
data sets in which the imputed values vary and the observed ones are always the same.
28
This does not affect the argument made about the degree of missingness within individual variables
which led to 50 imputed data sets. The overall degree of missingness is still relatively low.
References
Acemoglu, D. and Robinson, J. (2001). A theory of political transitions. American
Economic Review, 91(4):938–963.
Alesina, A., Devleeschauwer, A., Easterly, W., Kurlat, S., and Wacziarg, R. (2002).
Fractionalization. NBER Working Paper 9411.
Alesina, A., Oezler, S., Roubini, N., and Swagel, P. (1996). Political instability and
economic growth. Journal of Economic Growth, 1:189–211.
Allen, C. (1995). Understanding African politics. Review of African Political Economy,
22(65):301–320.
Arat, Z. (1988). Democracy and economic development: Modernization theory revisited.
Comparative Politics, 21(1):21–36.
Atkinson, A. B. and Brandolini, A. (2001). Promise and pitfalls in the use of “Secondary”
data-sets: Income inequality in OECD countries as a case study. Journal of Economic
Literature, XXXIX:771–799.
Barro, R. and Lee, J.-W. (2000a). Education attainment in the adult population (Barro-Lee
data set). available online at http://go.worldbank.org/8BQASOPK40.
Barro, R. and Lee, J.-W. (2000b). International data on educational attainment: updates
and implications.
Beck, N., Epstein, D., Jackman, S., and O’Halloran, S. (2002). Alternative models of
dynamics in binary time-series-cross-section models: The example of state failure.
Prepared for delivery at the 2001 annual Meeting of the Society for Political Methodology,
Emory University, Draft of July 12, 2002.
Boix, C. (2003). Democracy and Redistribution. Cambridge: Cambridge University
Press.
Clark, W. R., Golder, M., and Nadenichek Golder, S. (2008). Principles of Comparative
Politics. Washington: CQ Press.
Deininger, K. and Squire, L. (1996). A new data set measuring income inequality. The
World Bank Economic Review, 10(3):565–591.
Epstein, D. L., Bates, R., Goldstone, J., Kristensen, I., and O’Halloran, S. (2006).
Democratic transitions. American Journal of Political Science, 50(3):551–569.
Geddes, B. (2007). What causes democratization? In Boix, C. and Stokes, S. C., editors,
The Oxford Handbook of Comparative Politics. Oxford: Oxford University Press.
Gill, J. (2008). Bayesian Methods - A Social and Behavioral Sciences Approach. Boca
Raton: Chapman & Hall/CRC, second edition.
Greene, W. H. (2003). Econometric Analysis. Upper Saddle River, NJ: Pearson Prentice
Hall.
Helliwell, J. F. (1992). Empirical linkages between democracy and economic growth.
NBER Working Papers Series.
Heston, A., Summers, R., and Aten, B. (August 2009). Penn World Table version 6.3,
Center for International Comparisons of Production, Income and Prices at the University
of Pennsylvania. available online at http://pwt.econ.upenn.edu/.
Honaker, J. and King, G. (2010). What to do about missing values in time-series
cross-section data. American Journal of Political Science, 54(2):561–581.
Honaker, J., King, G., and Blackwell, M. (2010). Amelia II: A program for missing data,
version 1.5.
Honaker, J., King, G., and Blackwell, M. (2011). Amelia II: A program for missing data.
Journal of Statistical Software, 45(7):1–47.
Huntington, S. P. (1968). Political Order in Changing Societies. New Haven and London:
Yale University Press.
Huntington, S. P. (1971). The change to change: Modernization, development, and
politics. Comparative Politics, 3(3):283–322.
Imai, K., King, G., and Lau, O. (2007). Zelig: Everyone’s statistical software. available
online at http://GKing.harvard.edu/zelig.
Imai, K., King, G., and Lau, O. (2008). Toward a common framework for statistical
analysis and development. Journal of Computational and Graphical Statistics,
17(4):892–913.
King, G., Honaker, J., Joseph, A., and Scheve, K. (2001). Analyzing incomplete political science data: An alternative algorithm for multiple imputation. The American
Political Science Review, 95(1):49–69.
King, G., Tomz, M., and Wittenberg, J. (2000). Making the most of statistical analyses:
Improving interpretation and presentation. American Journal of Political Science,
44(2):341–355.
Lerner, D. (1973). Modernization: Social aspects. In Sills, D., editor, International
Encyclopedia of the Social Sciences, Vol. 9. New York: Collier, Macmillan.
Lipset, S. M. (1959). Some social requisites of democracy: Economic development and
political legitimacy. The American Political Science Review, 53(1):69–105.
Marshall, M. G. and Jaggers, K. (2010). Polity IV project: Political regime characteristics
and transitions, 1800-2008. available online at
http://www.systemicpeace.org/polity/polity4.htm.
Payne, A. and Phillips, N. (2010). Development. Cambridge: Polity Press.
Przeworski, A., Alvarez, M. E., Cheibub, J. A., and Limongi, F. Democracy and
development extended data set. available online at
http://politics.as.nyu.edu/object/przeworskilinks.html.
Przeworski, A., Alvarez, M. E., Cheibub, J. A., and Limongi, F. (2000). Democracy
and Development - Political Institutions and Well-Being in the World, 1950-1990.
Cambridge: Cambridge University Press.
Ross, M. (2001). Does oil hinder democracy? World Politics, 53(April):325–361.
Rubin, D. (1977). Formalizing subjective notions about the effect of nonrespondents in
sample surveys. Journal of the American Statistical Association, 72(September):538–
543.
Rubin, D. (1987). Multiple Imputation for Nonresponse in Surveys. New York: Wiley.
Sandbrook, R. (2000). Closing the Circle: Democratization and Development in Africa.
Toronto: Between the Lines.
Schafer, J. L. (1997). Analysis of Incomplete Multivariate Data. London: Chapman and
Hall.
Schafer, J. L. (1999). Multiple imputation: a primer. Statistical Methods in Medical
Research, 8.
Schafer, J. L. and Olsen, M. K. (1998). Multiple imputation for multivariate missing-data
problems: A data analyst’s perspective. Multivariate Behavioral Research, 33(4):545–
571.
So, A. Y. (1990). Social Change and Development - Modernization, Dependency, and
World-System Theories. London: Sage Publications.
Therkildsen, O. (2005). Understanding public management through neopatrimonialism:
A paradigm for all African seasons? In Engel, U. and Olsen, G. R., editors,
Contemporary Perspectives on Developing Societies – The African Exception, pages 35–52.
Hants: Ashgate Publishing Ltd.
Todaro, M. P. and Smith, S. C. (2006). Economic Development. Harlow: Pearson, ninth
edition.
WorldBank (2009). African development indicators. available online via ESDS,
http://www.esds.ac.uk/.
WorldBank (2010). World development indicators, edition: September 2010. available
online via ESDS, http://www.esds.ac.uk/.
Zagha, R., Nankani, G. T., and WorldBank (2005). Economic growth in the 1990s:
learning from a decade of reform. Washington D.C.: The International Bank for
Reconstruction and Development / The World Bank.