THE EFFECT OF NONRANDOM SELECTION OF CLUSTERS IN A
TWO STAGE CLUSTER DESIGN
by
JASON PARROTT
DISSERTATION
Submitted to the Graduate School
of Wayne State University,
Detroit, Michigan
in partial fulfillment of the requirements
for the degree of
DOCTOR OF PHILOSOPHY
2012
MAJOR: EVALUATION & RESEARCH
Approved by:
______________________________
Advisor
Date
______________________________
______________________________
______________________________
______________________________
© COPYRIGHT BY
JASON PARROTT
2012
All Rights Reserved
ACKNOWLEDGEMENTS
I would like to thank my advisor, Dr. Shlomo Sawilowsky, whose support and guidance
helped make this dissertation possible. His willingness to meet and discuss the
foundations of my dissertation is greatly appreciated. I would also like to thank my
committee: Dr. Gail Fahoome, Dr. Barry Markman, and Dr. Hermina Anghelescu, who
supported me in the completion of my dissertation. I appreciate the flexibility and
guidance that my committee provided.
TABLE OF CONTENTS
Acknowledgements…………………....………………………………………………………..ii
List of Tables………………………..………..…..…………………………...........................iv
List of Figures………..…………………………………………….…………………………....v
Chapter One: Introduction………………...………………..……………………………….....1
Chapter Two: Literature Review………………………………………………………………..6
Chapter Three: Methodology….…….……………………………………………………......32
Chapter Four: Results………………………………………………………………………….35
Chapter Five: Conclusions and Recommendations….………………………………..…...43
Appendix A Formulas………………………….……………………………………………...48
Appendix B Rho Chart………………………….………………………………………….….58
Bibliography……………………………………………………………………………….…...62
Abstract…………………………………………………………………………………………70
Autobiographical Statement…………………………………………………………………..71
LIST OF TABLES
Table 4.1: Rho Table……………………………………………..………………………...…36
Table 4.2: Random vs. Purposeful Lower Limit Cluster Results……………...……….....38
Table 4.3: Random vs. Purposeful Upper Limit Cluster Results………………………….40
Table 4.4: Proportion of purposeful cluster over random cluster…………………….41-42
LIST OF FIGURES
Figure 4.1: Random vs. Purposeful Lower Limit Cluster Results…………………….......37
Figure 4.2: Random vs. Purposeful Upper Limit Cluster Results...................................39
Figure 4.3: Proportion of purposeful cluster over random cluster………………………..41
CHAPTER I
INTRODUCTION
It would be ideal to test every subject within the target population when conducting
experiments. However, such research often proves too cumbersome, unmanageable,
and unrealistic. A more reasonable, although less accurate, approach is to take a sample from
the desired population and conduct an experiment on the sample. Inferences can then be
made regarding the characteristics of the population. Sampling reduces costs and improves
speed in conducting experiments (Cochran, 1977).
There are many sampling procedures that can be used to represent populations. Each
procedure is based on certain principles and assumptions. One of the principles that sampling
must adhere to is randomization, which is the process that gives each subject in the population
a non-zero chance of being selected (Weisberg, Krosnick, & Bowen, 1996).
In the context of experimental design, randomization is an important step in ensuring
that the variability in participants is equally distributed into treatment groups thus eliminating
the possibility that any portion of the population will be overrepresented (Weisberg, Krosnick, &
Bowen, 1996). Randomization is essential in both the selection and assignment processes.
The selection of participants must occur before assignment can occur (Runyon, Coleman &
Pittenger, 2000).
Selection is the process by which participants for a study are chosen (Campbell &
Stanley, 1963). The treatment group receives the intervention that the researcher is attempting
to study, while the control group receives no treatment and its condition is held constant. Random selection
is essential for generalization of the study. When random selection is done correctly, the
results of the study can be applied to the population as a whole within the significance level set
by the researcher.
Assignment is the process of deciding how the treatment will be distributed (Runyon,
Coleman & Pittenger, 2000). It is possible to either assign treatments to the groups or to
assign the groups to the treatment. When random assignment is violated, the study loses
internal validity, meaning that the researcher cannot be certain whether the results obtained were due
to the intervention or to an undefined extraneous variable.
Ceteris paribus, simple random sampling provides for the most accurate inferences among
sampling procedures, and is often the easiest to conduct. However, it may not be practical in
some research contexts. For example, a study sampling from the population of
the United States can be more efficiently conducted by defining a sampling frame of groups or
clusters, and then selecting participants from those clusters. The problem with this method is
that the savings in time and cost generally come at the expense of statistical efficiency. The
dilemma that the researcher is left to solve is when to use simple random sampling as
opposed to cluster sampling. Kish (1965) suggested that when the lower cost per element of
sampling outweighs the increase in variance and the problems associated with statistical
analysis, cluster sampling is the more realistic choice. This scenario often occurs in large
widespread samples.
Once it has been determined that the researcher is going to use a cluster sample, it is
important that the clusters are well defined (Sudman, 1976). A desirable cluster is determined
by the researcher’s objectives (Kish, 1965). The researcher’s participants can either be
organized or naturally placed into clusters that share common characteristics. Clusters can be
formed from many different kinds of units, including medical practices, school classrooms, voting
districts, counties, and states.
Once the participants are assigned, and divided into clusters, the researcher can then
test the entire cluster or can draw another sample from the cluster. In the latter method,
individual participants are randomly selected to represent the entire cluster. These participants
are tested and then the results are analyzed according to the study parameters. This form of
cluster sample is called a two stage cluster sample. Cluster sampling must adhere to the same
rules as individual sampling with respect to randomization at all levels of the study. This
process can be violated when the researcher wants to ensure certain participants are included
in the study.
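To make the two stage procedure concrete, the following sketch (illustrative only, and not part of this dissertation's simulation) draws such a sample in Python: clusters are selected at random in the first stage, and individuals are selected at random from each chosen cluster in the second stage. The population, cluster sizes, and sample sizes shown are hypothetical.

    import random

    def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
        # Stage 1: randomly select whole clusters.
        rng = random.Random(seed)
        chosen = rng.sample(range(len(clusters)), n_clusters)
        # Stage 2: randomly select individuals within each chosen cluster.
        sample = []
        for c in chosen:
            members = clusters[c]
            sample.extend(rng.sample(members, min(n_per_cluster, len(members))))
        return chosen, sample

    # Hypothetical population: 83 clusters ("counties") of 500 individuals each.
    population = [[(county, person) for person in range(500)] for county in range(83)]
    chosen_counties, units = two_stage_cluster_sample(population, 10, 20, seed=1)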
For example, Michigan has 83 counties. Each of these counties could serve as a defined
cluster when examining the state, and if each were selected at random this would be a valid
experiment. However, 4,052,201 people, 40% of the state's population, are located in the
tri-county area of Wayne, Macomb, and Oakland counties (Michigan Governmental Website,
July 21, 2010; retrieved July 21, 2010 from
http://www.michigan.gov/cgi/0,1607,7-158-54534-240589--,00.html).
Because a considerable share of the population, wealth, and state interests is
located within these three Michigan counties, they are frequently automatically included in
many studies and the participants from the remaining counties are then randomly selected to
complete the sample. For a variety of possible reasons (e.g., lack of understanding, political
pressure), it appears that some researchers believe that a study that did not include these
three counties would be discounted or considered irrelevant.
This same scenario occurs in the state of New York. During 2002-2006, the state had an
average population of 19,228,641. Of those, 8,177,449 (43%) lived in the five boroughs of
the Bronx, Brooklyn, Manhattan, Queens, and Staten Island (New York Governmental Website,
Department of Labor, July 1, 2009; retrieved July 21, 2010 from
http://www.labor.ny.gov/stats/nys/statewide_population_data.asp). It is the belief of some that
a study of the State of New York that did not include these highly populated, well-known
counties would not be considered as important as one that included them.
Some researchers would argue that automatically including these New York counties
would be considered valid as long as the individual participants are randomly selected from the
clusters. However, if at any point in a study the principles of randomization are violated, the
sampling frame is no longer representative of the entire state, and conclusions regarding that
population are, at least to some nonarbitrary degree, invalid.
Statement of the Problem.
The question that this study will answer is: To what extent is the validity of studies using
purposefully selected cluster samples compromised? More specifically, it is the purpose of this study to
demonstrate how the principles of randomization are violated in two-stage cluster sampling
and how much this violation affects the results of the studies. The study will examine the
impact of cluster sampling when random selection of clusters has been violated. In
particular, it is concerned with the Type I error properties (false positives) that may occur after
failure to randomly select clusters in the first stage of a two stage cluster design.
Limitations:
In order to conduct the simulation, there needs to be a distribution from which to draw
data. In this study, a normal distribution will be used. A normal curve is not necessarily the best
representation of real data sets. However, it will permit a close view of the best case scenario.
Hence, if the results are unsatisfactory for normally distributed data, perforce the substitution of
real data will yield even less satisfactory results.
CHAPTER II
LITERATURE REVIEW
When conducting experiments, the researcher is often manipulating an independent
variable in order to see if it produces a significant change in the dependent variable. In order
to improve the power of the study, the researcher may match samples or use stratification to
help ensure the samples have baseline equality. In the process of doing so, the sample can
become more restricted to meet the parameters of the design, and as a result, the internal
validity of the study may be compromised.
This problem can become compounded with cluster sampling, which is a complex
procedure because the researcher has to account for the variance between clusters as well as
the variance between individual participants. The purpose of this review is to examine what
cluster sampling is and how it differs from the sampling of individual participants. Unique
differences such as unit of analysis, intercluster correlation, sample size, and effect size will be
discussed in greater detail, as well as equations to account for these differences. The fact that
these procedures are well known yet still not consistently incorporated will also be examined.
Procedures used to increase the power of statistical tests following cluster sampling will also
be discussed, and how these procedures can influence the internal validity of the results and
the external validity of the study. Finally, the review will discuss a brief history of how cluster
sampling evolved to its current state and how accurately it has been incorporated recently.
The final discussion will relate to the purpose of this study.
Uniqueness of Cluster Designs: Cluster randomized statistical designs or experiments
are designs where clusters of individuals are randomly assigned to treatment and control
groups rather than the individuals themselves (Donner, 1998). A two stage cluster sample is a
cluster sample where clusters are either selected randomly or by design, and from those clusters
individuals are then selected to represent the clusters. According to Donner and Klar (2000),
there are three commonly used designs when setting up cluster randomized trials. They are as
follows: completely random, stratified, and matched pair.
The completely randomized cluster design assigns intact clusters without consideration
to other factors (Donner, 1998). For example, if researchers in a state were examining whether
to enact a reading program for its students and to test the effects of the program, they could
use a completely randomized design. To accomplish this, they would randomly select districts
within the state and assign them to an experimental group and a control group. If each
district in the state had an equal chance of being selected, without any predisposed criteria, it
would be a completely randomized design.
The completely randomized design is the simplest yet most powerful design to
set up. This design is most appropriate for a large number of clusters (Donner & Klar, 2000). It
is also easy and inexpensive in terms of sample selection, data analysis, and sampling variance
(Sudman, 1976). The largest drawback to this design is that although it is theoretically and
mathematically the most efficient, it is often logistically difficult to do. It can be a long and
tedious process to construct this design if the sample size or population is large (Sudman,
1976). An example of this form of design is the ACEH trial conducted by Abdeljaber (1991). In
this study, 229 villages from a sample of 450 were selected and given vitamin A supplements.
They were then compared against a control group to determine the effectiveness of the
supplements on various health factors. Because selection was taken at random with no other
predisposed criteria, it is considered a randomized design. This study differs from a
stratification or matching study because the sample was not further categorized from the
original selection.
A stratified cluster design is a more stringent cluster design that assigns two or more
clusters to some combination of statistical subpopulation or intervention (Donner & Klar,
2000). This subpopulation can be any population that seems to be a relevant contributor to the
study. For instance, some cluster designs are stratified according to factors such as cluster
size, geographic area, or socioeconomic status (Donner, 1998). Using the previously
discussed example on reading programs, if the districts were divided according to size before
they were randomly selected and assigned, this would be a form of stratification.
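As an illustration of that reading program example (a sketch under assumed district names and an assumed enrollment cutoff, not data from any actual study), the following Python fragment stratifies districts by size and then randomly assigns whole districts to conditions within each stratum.

    import random

    def stratified_cluster_assignment(district_sizes, cutoff, seed=None):
        # Randomize whole districts (clusters) to conditions within size strata.
        rng = random.Random(seed)
        assignment = {}
        small = [d for d, n in district_sizes.items() if n < cutoff]
        large = [d for d, n in district_sizes.items() if n >= cutoff]
        for stratum in (small, large):
            rng.shuffle(stratum)
            half = len(stratum) // 2
            assignment.update({d: "treatment" for d in stratum[:half]})
            assignment.update({d: "control" for d in stratum[half:]})
        return assignment

    # Hypothetical districts and enrollments; the 2,000-student cutoff is assumed.
    sizes = {"A": 900, "B": 4200, "C": 1100, "D": 5200,
             "E": 800, "F": 3900, "G": 950, "H": 4700}
    print(stratified_cluster_assignment(sizes, cutoff=2000, seed=7))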
Stratification designs are most effective in small studies (Donner and Klar, 2000). When
used correctly, stratification nearly always results in smaller variance for the estimated mean or
total than is given by the simple random sample (Cochran 1977). This reduction in variance
can make the sample more efficient (Sudman, 1976). There are four primary reasons to
stratify. They are as follows:
1. Strata themselves are of primary interest
2. Variances differ between strata
3. Costs differ between strata
4. Prior information differs by strata (Sudman, 1976).
There are no disadvantages to using stratification; however, unless the units of
stratification are exceedingly large and only small amounts of variation remain within the strata,
the gains will only be moderate at best (Hansen, Hurwitz & Madow, 1953). Although there are
no disadvantages, there are certain situations where stratification should not be used as a
means to strengthen a design. Stratification should not be used in the following ways:
1. To ensure randomness
2. With nonprobability samples
3. To adjust for noncooperation (Sudman, 1976)
An example of a stratification design is the Child and Adolescent Trial For
Cardiovascular Health, or CATCH, trial (Perry et al., 1997). In this trial, the objective was to
compare the change in serum cholesterol levels from the baseline to the end of the
intervention period. Students were assigned to either a school-based intervention, school
based and family based intervention, or the standard curriculum. The school-based
intervention was CATCH curricula, means to provide for more nutritional meals, and health
based classroom activities. In addition to this, the group who received the additional family
based intervention also received at home activities. The stratifying factor was geographic
areas or cities with each city contributing 24 schools to both the control and experimental
groupings (Donner & Klar, 2000). In this design, the researchers had the flexibility to determine
by which factors to stratify. This flexibility does not exist in the matched pair design.
The matched pair design is the most stringent design of the three. In a matched design,
two clusters in a stratum are randomly assigned to an experimental group and a control group
(Donner, 1998). With a matched pair design, the presumed extraneous variables form a tight
match, reducing imbalance in baseline risk factors. The strength of the matched pair design is that it
can lead to an increase in power for the study. The power of the matched design will continue
to improve as the effectiveness of the match increases (Donner and Klar, 2000). However, the
matched pair design has many disadvantages. It is difficult to determine the inter-cluster
correlation between matched pairs. Also, if the matching variables are not related to the outcome,
there may actually be a loss in power due to loss of degrees of freedom (Klar & Donner, 2000).
Finally, there is also the concern that if participants are being matched according to one
variable, they may differ widely on many other unknown variables. In general, matching should
be avoided in studies with a small sample. Matched designs typically work better when there are
more than 10 pairs (Donner & Klar, 2000). In the hypothetical example used earlier, two
districts with similar, previously agreed upon criteria would be matched and then randomly assigned
to a control and an experimental group, as in the following example.
Royce et al. (1993) examined the smoking cessation attempts between African
Americans and Caucasian Americans. They used baseline data from the COMMIT research
group which was designed to match groups according to the following factors: community size,
population density, demographic profile, community structure, and geographic proximity.
These matching factors helped form equal groupings among these variables. However, they
could exacerbate differences in other areas. In addition to the uniqueness of each form of
cluster design, cluster designs as a whole differ from other designs in many other statistical
factors.
Cluster designs differ from their individual design counterparts because the effects
of clustering must be accounted for statistically. When participants are clustered, they may
either already share similar traits or may acquire them as a result of their clustering. These
similarities often cause the participants to not be statistically independent and, as a result,
cause inflated rates of rejecting the null hypothesis (Simpson et al., 1995). The similarities
are unknown but usually occur for one of two reasons: either participants are grouped together
because of similarities not accounted for in the design, or because they have been together for
a long enough period of time that they have influenced each other through discussions or other
actions. These influences have led them to make similar observations or responses.
Consider the following example for the former of these two situations. Schools in
separate districts are chosen to partake in achievement testing. Some of the schools are given
an intervention which is supposed to raise reading achievement and the remainder serve as a
control. In the selection process, many of the more affluent districts are given the intervention,
and it is determined that a significant result has been returned. The result could be attributed to
the students’ higher socioeconomic status or other supplemental interventions the district
offered.
An example of the latter of these two instances often occurs in a workplace
environment. As people work together for longer periods of time, they tend to influence each
other’s decision making. They also have a greater chance of being exposed to the same
extraneous variables. For instance, if one member of a cluster were exposed to an infectious
disease, the other group members would be more likely to acquire this disease due to increased
exposure regardless of interventions used (Simpson et al., 1995). This cluster effect has to be
quantified to determine the validity of a cluster design.
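The consequence of this cluster effect can be demonstrated with a small illustrative simulation (a sketch, not the Fortran program used later in this study): when a shared cluster-level effect is present and an ordinary two-sample t test is applied to the individual scores as if they were independent, the nominal .05 test rejects a true null hypothesis too often. The cluster counts, cluster size, and ICC below are assumptions chosen for illustration.

    import numpy as np
    from scipy import stats

    def naive_type1_rate(n_clusters=10, m=20, icc=0.10, reps=2000, seed=0):
        # Proportion of nominal .05 rejections when a t test ignores clustering.
        rng = np.random.default_rng(seed)
        sd_b, sd_w = np.sqrt(icc), np.sqrt(1 - icc)   # total variance fixed at 1
        rejections = 0
        for _ in range(reps):
            groups = []
            for _ in range(2):
                # Null data: no treatment effect, only cluster effects plus noise.
                cluster_effects = rng.normal(0.0, sd_b, n_clusters)
                scores = cluster_effects[:, None] + rng.normal(0.0, sd_w, (n_clusters, m))
                groups.append(scores.ravel())
            p = stats.ttest_ind(groups[0], groups[1]).pvalue
            rejections += p < 0.05
        return rejections / reps

    print(naive_type1_rate())   # typically well above the nominal .05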
In addition to the clustering effect, cluster designs pose other statistical challenges compared to their
non-clustered counterparts. Before explaining the intricacies of each individual design that is
used in cluster randomized trials, it is important to explain the different variables or adaptations
of the standard components that can become an issue in cluster randomized trials if not
addressed. Among these issues that the researcher needs to consider are the unit of analysis,
inter-cluster correlation, sample size, and effect sizes.
Unit of Analysis: Conducting traditional experiments using individual participants is
standard methodology. The researcher randomly samples from a pool of individuals and then
randomly assigns those individuals to a treatment or a control group. The treatment is given to
the individual directly and the researcher evaluates the results of that treatment. Often in
cluster designs, the cluster itself is assigned to a treatment or a control group and the
individuals are then given the treatment. The entire cluster is then evaluated as to the success
of the treatment. This presents a problem because it cannot be determined if the change is a
result of the treatment or the effect of the clustering of the individuals. The unit of analysis
differs from the unit of selection and as a result poses a statistical challenge.
Cluster trials have a unit of analysis issue that needs to be addressed. The unit of
analysis problem occurs when the level of assignment to study conditions
and the level of analysis of the data differ (Rooney & Murray, 1996). For example, consider
participants attempting to lose weight through an exercise program. The treatment group could
be placed in a class where the intervention is given and compared to that of a control group.
The individuals would then be weighed prior to and at the conclusion of the class. The success
of the program would be judged on the amount of weight loss by each individual participant
even though the class was the unit that was used to assign participants.
The effects that the clustering had on the group could be the cause of its success, not
the program itself. Cornfield (1978) discussed the extent to which clustering can affect the outcome of
a research study. He demonstrated how to account for the clustering effect involved in group
trials, and to what extent sample size will need to be increased to neutralize this effect. He
concluded that randomization studies by cluster with evaluation at the individual level can yield
information and should not be discouraged. However, when using these studies the analysis
must be appropriate, and treating them as standard individual studies “is an exercise in self-deception and should be discouraged” (Cornfield, 1978). In order to properly use cluster
designs, the researcher must account for clustering. The first steps in accounting for the
clustering effect are evaluations of the intraclass correlation coefficient, sample size, and effect
size. Each of these will be discussed in greater detail below.
Intercluster correlation coefficient (ICC): In order to properly account for the clustering
effect, researchers quantify it using an intraclass correlation coefficient (ρ) (Donner & Klar,
2000). The intraclass correlation, or the intercluster correlation, is defined as the standard
Pearson correlation between any two subjects in the same cluster (Donner & Klar, 2000). This
correlation can be quantified by using the following formula:
A1.) ρ = σ²B / (σ²B + σ²W)
where σ²B is the variance between clusters and σ²W is the variance within clusters.
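For illustration (this is the usual one-way ANOVA estimator for equal-sized clusters, not a formula reproduced from the sources above), ρ can be estimated from sample data as follows, where MSB and MSW denote the between- and within-cluster mean squares and m is the common cluster size; the data in the example are hypothetical.

    import numpy as np

    def icc_anova(data):
        # One-way ANOVA estimator of rho for equal-sized clusters.
        # `data` has shape (number of clusters, members per cluster).
        data = np.asarray(data, dtype=float)
        k, m = data.shape
        cluster_means = data.mean(axis=1)
        grand_mean = data.mean()
        msb = m * np.sum((cluster_means - grand_mean) ** 2) / (k - 1)
        msw = np.sum((data - cluster_means[:, None]) ** 2) / (k * (m - 1))
        return (msb - msw) / (msb + (m - 1) * msw)

    # Hypothetical data: 20 clusters of 10 scores sharing a cluster-level effect.
    rng = np.random.default_rng(2012)
    scores = rng.normal(0.0, 1.0, (20, 10)) + rng.normal(0.0, 0.5, (20, 1))
    print(icc_anova(scores))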
There are many reasons for the possibility of variation between clusters, including the following:
a.) Individuals frequently select clusters with which they share common characteristics. For example, census data that are clustered by county would have similar people living together. A county full of residents in Wayne County, Michigan could be substantially different from a county full of residents in Oakland County, Michigan.
b.) Covariates at one cluster or level affect many of the participants within that level. For example, smoking cessation participants who live in a highly industrial area may show increased lung damage compared to their counterparts. This could occur due to pollution in the area rather than the effects of smoking.
c.) Participants within clusters have more exposure to each other compared to other clusters and as a result influence each other. For example, voters who tend to be moderate or in the center on issues may tend to be swayed by an event or organization put on by a particular party. This may occur as a result of comments made by the other participants in the group (Donner et al., 1990).
Intraclass correlation values bear directly on how the overall success of the intervention can
be judged. If these values are too large, it is difficult to determine whether the intervention or an
extraneous variable was the cause of the change in behavior. Acceptable intraclass correlations can
differ according to the design of the experiment. However, there are some guidelines that have
been established.
In a study conducted by Hedges and Hedberg (2007), it was determined that the
average ICC value for education-based research was .22. For studies that used
primarily low-socioeconomic schools, that level dropped to an average of .19, and for
low-achievement schools the ICC decreased to .09. ICCs for medical research can range
anywhere from .2 to .6 depending on the study (Donner et al., 1981). The acceptable
intraclass correlation is dependent on the parameters of the study. An ICC value of .6 would
not be acceptable in a study concerning low-achieving schools. However, it may be acceptable
in some areas of epidemiological research.
Once the correlation coefficient has been determined, the variance inflation factor can
be calculated. The variance inflation factor, or design effect, quantifies the loss of statistical
efficiency for the design. This loss can be expressed by the following equation:
A2.) D = 1 + (m − 1)ρ
where m is the number of individuals per cluster and ρ is the intra-cluster correlation value
(Hayes et al., 2000). Not only does the researcher have to adjust for the variance inflation
factor caused by the clustering of individuals, they must also determine the sample sizes for
two different groupings. This variance between groups is a key determinant in setting up an
experiment. An inflated variance causes decreased power in the study and in turn means that
the sample size needs to be increased (Feng et al., 1999).
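A brief illustration of equation A2 follows (the cluster size of 25 and total sample of 1,000 are hypothetical; the ρ values are those cited above for educational research): the design effect converts a clustered sample into its equivalent number of independent observations.

    def design_effect(m, rho):
        # Variance inflation factor from equation A2: D = 1 + (m - 1) * rho.
        return 1 + (m - 1) * rho

    def effective_sample_size(n_total, m, rho):
        # Number of independent observations the clustered sample is worth.
        return n_total / design_effect(m, rho)

    # Hypothetical cluster size of 25 and total sample of 1,000, with the ICC
    # values reported by Hedges and Hedberg (2007) for educational research.
    for rho in (0.09, 0.19, 0.22):
        print(rho, round(design_effect(25, rho), 2),
              round(effective_sample_size(1000, 25, rho)))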
Sample size. Sample size determination is a function of three things that will typically be in
conflict with one another. They are as follows: cost, practicality, and scientific objectives
(Hayes et al., 2000). When deciding upon the appropriate sample size, there are two levels to
consider. The first is the number of clusters chosen, and the second is the number of
participants chosen per cluster (Campbell, 2000). In many cases, the researcher cannot
control the number of participants per cluster since many of them are set for him/her already
(Raudenbush, 1997). For example, in studying classrooms within schools, the researcher is
limited to the set number of students that are within the class. In determining sample size, it is
important to consider that allocating large numbers of people per cluster can constrain the
number of clusters that can be allocated (Raudenbush, 1997). The rationale behind this is
that selecting a large number of participants per cluster causes the overall number of
individuals needed for the experiment to increase. At this point, it may be more difficult or
costly to examine or recruit the necessary number of individuals, and in turn it may be more
difficult to fill an adequate number of clusters. Increasing the number of individuals per cluster
will not necessarily improve efficiency. In determining the number of individuals per cluster, it is
important to remember that this is a prelude to determining the number of clusters that can be
constructed (Raudenbush, 1997).
Inflating the number of clusters does not ultimately increase statistical
efficiency either. Hayes et al. (2000) detailed this phenomenon when discussing an
experiment regarding malaria transmission among African villages. In their study, they stated that
malaria, like other infectious diseases, tends to be concentrated, varying considerably from one village to the
next. Using villages as the cluster unit will therefore cause the intraclass correlation to be exceptionally
large, and a better cluster unit may be a grouping of villages within a geographic area. The
parameters of the study should dictate the appropriate balance of sample size between
clusters and individuals.
There is no set formula for determining the appropriate sample size for all studies.
However, there are guidelines and formulas derived to assist in the planning of studies. Hsieh
(1988) developed formulae regarding sample sizes in cluster trials. He allowed for sample size
calculations using either the variance between clusters or an estimate of the variance within
clusters. Given this information along with the intercluster correlation, the adequate number of
individuals per cluster can be determined, and using the power contours which are also
provided, the number of clusters can then be calculated (Hsieh, 1988).
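Hsieh's formulae and power contours are not reproduced here; the sketch below instead uses the commonly cited design-effect approximation for comparing two means, in which the individually randomized sample size is inflated by 1 + (m − 1)ρ and divided by the cluster size. The α level, power, effect size, cluster size, and ICC are hypothetical values chosen for illustration.

    from math import ceil
    from scipy.stats import norm

    def clusters_per_arm(delta, m, rho, alpha=0.05, power=0.80):
        # Individually randomized sample size per arm for a standardized
        # difference `delta`, inflated by the design effect 1 + (m - 1) * rho,
        # then converted to whole clusters of size m.
        z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
        n_individual = 2 * (z / delta) ** 2
        n_clustered = n_individual * (1 + (m - 1) * rho)
        return ceil(n_clustered / m)

    # Hypothetical planning values: effect size 0.30, 25 students per classroom, ICC .19.
    print(clusters_per_arm(delta=0.30, m=25, rho=0.19))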
Campbell (2000) also developed sample size formulas. He used the following formula to
determine the number of patients per practice:
A3.)
The formulas may be useful in determining whether to increase the number of practices
or to increase the number of patients per practice in medical studies. Using the above formulas
and variations of these formulas, the researcher can determine which combination of clusters
and individuals will work for their study. Each design will encompass its own intricacies which
make it unique. The researcher must balance the factors of cost, practicality, and scientific
objectives when determining the appropriate sample size (Hayes et al., 2000). In addition to
sample size considerations, the researcher must also calculate the effect size in a different
manner than with an individual design.
Effect Sizes: Much like sample sizes, formulas and sample size allocation are different
for cluster sampling designs as opposed to individual designs. The desired effect sizes from
these designs are also altered. An effect size measures the intervention's effect on the
individuals to whom it has been given. The effect size is the standardized difference between the
treatment and the control group (Rooney & Murray, 1996). By calculating the effect size, the
researcher can determine the sample size needed to achieve the desired power for their study.
Effect sizes can be affected by the cluster effect of individuals within a group. This cluster
effect (ICC) can cause a significant reduction in effect size and needs to be taken into account.
For instance, Rooney and Murray (1996) stated that an ICC as low as .002 can cause the
effect size to be reduced by 30% in experiments containing at least 100 students per school. To
account for the clustering effect, they suggest an adjusted effect size using the following
formula:
A4.)
Donner and Klar (2002) suggest using meta-analysis studies to account for the
clustering effect in defining an effect size for cluster randomized trials. They suggest four
methods for obtaining a more accurate effect size when working with binary data in meta-analysis trials. The first approach they suggest is the ratio estimator approach. The ratio
estimator approach developed by Rao and Scott (1992) divides the observed sample
frequencies in a given study by the estimated design effect.
The second approach Donner and Klar suggested to evaluate effect size is the Adjusted
Mantel-Haenszel test. This procedure is a commonly known procedure for evaluating binary
data in individually randomized trials. This procedure compares the outcomes of each
individual trial and weights the differences by their variance. The trials with the most stable
outcomes are more influential than those with the least (Donner & Klar, 2002). This procedure
can be adjusted to fit cluster data, if the clusters are of equal size, by
dividing the original equation by its inflation factor (Donner & Klar, 2002).
The third procedure that Donner and Klar referenced is the Woolf procedure. It takes
the effect sizes of trials with a small number of clusters of large size and transforms the
intervention odds ratios to a logarithmic scale. These values are averaged using a weighting scheme
by Woolf (1955), modified for clustering by Donner and Donald (1997).
The final approach that Donner and Klar (2002) suggested was to use a randomization
procedure such as Fisher’s permutation test. The advantage of this approach is its
statistical validity. However, this comes at the expense of a loss of power and the inability to
easily make a covariate adjustment. These methods are used with binary data in meta-analysis trials, but they do demonstrate how the effects of clustering must be taken into
account when determining accurate effect sizes in these trials.
Hedges (2007) developed another method for adjusting effect sizes in cluster
randomized trials. He reasoned that in cluster randomized trials there are several different
mean differences to choose from. Each of these differences will yield a different definition for a
population effect size. The three main effect size parameters that Hedges discusses are the
within mean difference, the between mean difference, and the total mean difference between
the treatment and control groups. Which one is the most practical depends on the interest of the
researcher (Hedges, 2007).
The within mean difference effect size can be defined as:
A5.) δW = (μT − μC) / σW
where μT and μC are the treatment and control group means and σW is the within-cluster standard deviation.
This effect size would typically be used in single-site studies. For example, if a school were
deciding whether to enact a certain reading program, it might have several classrooms randomly
assigned to receive the program and compare them to a control group in which traditional methods
were used. The desired effect size could be determined using the within mean
difference.
The second effect size parameter is defined as:
A6.) δB = (μT − μC) / σB
where σB is the between-cluster standard deviation.
This effect size would be used in studies that have multiple sites but are allocated on the basis
of the individual rather than the cluster. An example of this type of assignment strategy could
be the assignments of students to different high schools within a district. The students could be
assigned to different schools and then into classes from these schools to receive treatment or
be held as the control. The key is that the individual is the level of assignment.
The final effect size parameter that Hedges discussed was:
A7.) δT = (μT − μC) / σT
where σT² = σB² + σW² is the total variance.
This effect size parameter could be used to estimate effect sizes where the treatment effect is
to be determined at the cluster level. For example, students might be allocated to different
classrooms (clusters) to evaluate different teaching methods. The mean score of the class
would serve as the test statistic and the effect of the teaching method would be evaluated to
see which produced the most favorable results.
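Assuming the definitions sketched above for A5 through A7 and the decomposition of the total variance into between- and within-cluster components, the three parameters are linked through ρ, so any one of them can be converted to the others (shown here in LaTeX notation as an illustration):

    \sigma_T^2 = \sigma_B^2 + \sigma_W^2, \qquad
    \rho = \frac{\sigma_B^2}{\sigma_B^2 + \sigma_W^2}, \qquad
    \delta_W = \frac{\delta_T}{\sqrt{1 - \rho}}, \qquad
    \delta_B = \frac{\delta_T}{\sqrt{\rho}}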
The appropriate definition of mean differences must be chosen before attempting to
achieve the desired effect size. Once the appropriate parameter is determined, Hedges (2007)
provided equations to estimate the effect size for the study, and it is from these estimates the
sample size and power needed to achieve the desired effect can be determined. It is essential
to account for the effect of clustering in unit of analysis, sample size, effect size, and power.
Even though it is well known that these factors need to be addressed, reviews of published
studies have shown that this is not always done.
These statistical issues mentioned earlier are well known to researchers. However,
they are sometimes disregarded or ignored. In a study conducted by Isaakidis and Ioannidis
(2003), it was determined that only 20% of the 51 studies in their sample took clustering into
account in their sample size and power calculations, and only 37% took clustering
into account in the analysis. Intracluster correlations and design effects were reported in
only 2% and 6% of the trials, respectively.
The previous variables discussed occur in all cluster randomized trials and need to be
accounted for. There are, however, different types of trials that can be designed using cluster
or group randomization, each of which has its own strengths and weaknesses. Cluster
randomized trials can be done according to randomized, stratified, or matched designs. They
can also be done with or without the use of covariates. The rationale for using stratification,
matching, or covariate schemes is to increase the efficiency of designs in order to increase power. Due to
the cost of gathering samples for studies, it becomes essential to use prior information when
available to increase the likelihood of adequate power (Raudenbush et al., 2007). However, in
deciding to use these schemes, one needs to be careful not to sacrifice the integrity of a valid
sample to increase power.
Prior to treatment, experimental units can be separated or stratified into subclasses
called blocks that are perceived to be similar (Raudenbush et al., 2007). A
stratified randomized design is essentially a completely randomized design with the exception
of two or more stratifying factors to increase the chance of well-balanced intervention groups
(Lewsey, 2004). Some examples of stratification factors that can be used are cluster size,
socioeconomic status, geographic location, or any other categorical factor that may be
believed to influence groupings. Stratification in cluster randomized trials is chosen at the
design stage. If the strength of the stratification factors is believed to be high and there does not
seem to be an adequate number of clusters to achieve balance without stratifying, then this
design can increase power compared to that of its completely randomized counterpart
(Lewsey, 2004).
In a simulation study, Lewsey (2004) determined that stratifying by
cluster size did indeed increase the power of cluster randomized trials. The increase in power
was most beneficial when the number of clusters in the study was small. As the number of
clusters was increased, the samples had a greater chance to balance each other
out. A general rule stated by Klar and Donner (2004) is that stratification should only be used
when there is evidence that the strata represent important factors and when there are few
individuals in the trial. Stratification is also more beneficial in cases where there are two or
more clusters per stratum and twenty or more pairs. This will help to determine whether
the matches have distinct rather than similar attributes. However, if given the choice between
stratification and matching, Hayes et al. (2000) stated that stratification is a more desirable option
than matching.
Matching is a form of pre-randomization blocking in which the blocks consist of two
units that are believed to be equivalent on all variables with the exception of the intervention
(Raudenbush et al., 2007). Matching is another alternative for attempting to increase the power of
cluster randomized designs. Prior to treatment, experimental units can be separated into
subclasses called blocks that are perceived to be similar (Raudenbush et al.,
2007). Matching is not typically seen in trials randomizing individual participants. However,
matched designs are the design choice of many community intervention trials because of their perceived
ability to match groups on similar characteristics (Campbell et al., 2007).
Because it is often not possible to obtain adequate sample sizes to individually randomize
these trials, effective matching helps to reduce the probability of creating groups that are
substantially different in important baseline characteristics. However, if matching is not done
correctly, it can cause more harm than good (Campbell et al., 2007). According to
Raudenbush et al. (2007), matching will enhance statistical power when groups are well
matched and the matching characteristics strongly predict outcomes. The key factor to consider is the
variation that lies between groups, which can be indexed by the ICC. If the ICC is
large, then matching will be beneficial. However, if it is small, matching will not be effective and
could possibly hurt the study due to loss of degrees of freedom.
The objective in matching is to select a matching variable that is highly correlated with
the outcome measure (Hayes et al., 2000). Matching is also problematic because it is not
possible to distinguish the between-cluster variance from the treatment effect heterogeneity.
Because of this, the researcher cannot separately estimate the conventional intracluster
correlation and the variance of the estimated treatment effect between cluster members
(Campbell, 2000). Matching does have certain advantages over using covariates. It does not
require linear associations as the use of covariates does, and it provides more flexibility in
design (Raudenbush et al., 2007). Klar and Donner (1997) were critical in their findings
regarding matching. They stated that small samples are unlikely to achieve effective matching, and
large samples also have drawbacks when prior information is limited or close matches
cannot be found. For these reasons, stratified designs rather than matched designs are
recommended.
The third method of attempting to increase power in cluster designs is the use of
covariates. Covariates are characteristics that are strong predictors of the outcome and are
built into the study. Prior to treatment, experimental units can be separated into subclasses
called blocks which are perceived to be similar (Raudenbush et al., 2007). The covariate is
added as part of the linear association along with other predictors. By adding such so-called
extraneous variables into the equation, the researcher can help to reduce the residual variance, thus
greatly reducing the number of units needed to achieve a given power.
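One hedged way to express this (an illustration, not a formula taken from the sources cited above) is to suppose that a cluster-level covariate explains a proportion R_B² of the between-cluster variance; the residual between-cluster variance, the intraclass correlation, and therefore the design effect all shrink accordingly:

    \sigma_B^2 \;\rightarrow\; \sigma_B^2\,(1 - R_B^2), \qquad
    \rho_{adj} = \frac{\sigma_B^2 (1 - R_B^2)}{\sigma_B^2 (1 - R_B^2) + \sigma_W^2}, \qquad
    D_{adj} = 1 + (m - 1)\,\rho_{adj}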
Covariates may be inexpensive to acquire and to use, and they can greatly increase power
depending on the ICC (Raudenbush et al., 2007). Much like matching, if the ICC is large then
the use of a group-level covariate can be strong. However, like matching, if the ICC is small the
impact of the covariate will also be small. The use of covariates also shares some of the other
faults that matching does. For example, Stevens (1992) noted that even using multiple
covariates will not necessarily equate intact groups, and that the variables used to equate
groups may create greater differences on other variables. Sawilowsky (2007) demonstrated this via a
Monte Carlo simulation where a covariate adjustment for reading levels was incorrectly made
on the basis of a pre-test and as a result incorporated into the design. When the post-test, which
had less emphasis on reading ability, was given and the covariate adjustment was applied,
the study led to the false conclusion that the treatment variable was effective.
In the previous sections, the differences between individually randomized trials and
cluster randomized trials were discussed, as well as the different types of cluster designs. It is
well known that ignoring these differences can affect the validity of cluster randomized trials,
yet it is still done. In the following discussion, the validity of cluster randomized trials will be
examined.
Randomization and Validity: Randomization provides the rationale for treating independent
participants or groups of participants as, at least in theory, equal. It allows us to address the
threats to internal and external validity that can cause a study to be flawed. The issues in
validity were described by Campbell and Stanley (1963). They addressed some of the many
forms of validity problems a study can have. Any one validity issue can make the experiment
flawed.
Validity problems in cluster randomization have been improving but still exist. Eldridge et
al. (2008) found that 25% of the cluster designs in their sample were potentially biased due to
recruitment and identification of patients, and approximately 50% of the trials used
blinding of either allocation or assessors. Approximately 50% of the studies adequately
assessed generalizability of clusters, and external validity seemed to be poorly addressed in
many of the trials.
To limit the issues in validity, it is important that samples are drawn correctly. There are
both differences and similarities in how to draw standard independent samples and cluster
samples. Cochran (1977) stated the principal steps in any sample survey as follows:
1. State objective of survey
2. Identify the population to be sampled
3. Identify the data that are needed to be collected
4. Determine the degree of precision desired
5. Determine the method of measurement
6. Determine the frame or sampling units that will be used
7. Initiate pretest on small area to identify weaknesses in survey
8. Organize field work and training effectively
9. Summarize and analyze data
10. Evaluate information gained for future surveys
The purpose of sampling theory is to make samples more efficient and random
(Cochran, 1977). The steps above are basic steps in conducting survey sampling. These steps
are not independent of cluster sampling. However, there are some subtle differences. Cluster
sampling is not as accurate as simple random sampling; however, its use is appropriate when
the lower cost per element more than makes up for its disadvantages. This scenario often
occurs in large widespread samples (Kish, 1965).
When it is appropriate to randomize by clusters, there are certain procedures that need
to be followed. The parameters of the study must be defined. The researcher's randomization
scheme can be strong; however, if the sample is pulled from an improper population, then the
results may be misleading. If the sampling measures are sound, then they should mirror the
overall population as a whole (Upton, 1978). It is important to pay careful attention to time,
location, and any other variables that may cause the sample to not be representative of the
population. This is also true for subsampling. When subsampling, the individual participants
need to be representative of the clusters they are drawn from. According to Sudman (1976),
clusters must be well defined and every element must belong to one and only one cluster, the
number of population elements must be known or reasonably estimated, clusters must
be small enough to make clustering worthwhile, and clusters should be chosen to limit the
sampling error caused by clustering.
Once clusters have been carefully designed and organized, the process of selecting
them can begin. Hansen, Hurwitz, and Madow (1953) proposed the following procedure for
selecting individuals from clusters:
1. Number the primary sampling units (psu’s) accurately
2. Select at random a page from a random numbers table and scan down until a number
falls within your sample range, continuing to scan random numbers from the starting point until
the sample is filled
3. Divide each sampled block into four compact segments with roughly equal numbers
of elementary units
4. Number the segments in each block and take a random number from that selection
5. Collect desired information from the selection
Researchers prefer to use clusters of equal size. This is obviously not always possible
due to the nature of the experiment. Unequal clusters cause additional complexities in an
experiment. According to Kish (1965), once unequal clusters are chosen, the sample size is no
longer fixed and thus becomes a random variable. The ratio mean is not an unbiased
estimate of the population mean, and practical variances are not unbiased estimates of true
variances. To limit the problems caused by unequal clusters, Kish (1965) suggested selecting
large numbers of clusters, stratifying clusters according to size, defining and combining natural
clusters, and subsampling with probabilities proportionate to size (pps).
To subsample with probabilities proportionate to size, the researcher assigns to
each cluster a sequence of numbers equal to its size and then samples systematically
(Sudman, 1976). A size measure is assigned to each cluster and then cumulatively summed
over all clusters. A sampling interval is then determined by taking the cumulatively summed
total and dividing it by the number of clusters desired. A random start is then determined and
the sample is drawn systematically from the start point (Sudman, 1976). However, when
sampling pps, there are certain clusters that will be automatically included and thus not
randomly selected.
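The cumulation-and-interval procedure just described can be sketched as follows (the cluster sizes are hypothetical, and the code is an illustration rather than a prescribed algorithm from Sudman). It also makes the final point visible: a cluster whose size exceeds the sampling interval is included with certainty, and may even be hit more than once.

    import random

    def pps_systematic(sizes, n_select, seed=None):
        # Cumulate cluster sizes, compute the sampling interval, take a random
        # start, and select the cluster containing each systematic point.
        rng = random.Random(seed)
        total = float(sum(sizes))
        interval = total / n_select
        start = rng.uniform(0, interval)
        points = [start + i * interval for i in range(n_select)]
        bounds, cum = [], 0.0
        for idx, size in enumerate(sizes):
            bounds.append((cum, cum + size, idx))
            cum += size
        selected = []
        for p in points:
            for low, high, idx in bounds:
                if low <= p < high:
                    selected.append(idx)
                    break
        return selected

    # Hypothetical sizes: cluster 0 (size 5,000) exceeds the interval of 2,125
    # and is therefore always selected, possibly more than once.
    print(pps_systematic([5000, 800, 700, 600, 500, 400, 300, 200], n_select=4, seed=3))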
Even if proper randomization procedures are believed to be followed, there can be
oversights. In cluster trials, the risk of potential bias can occur both in the selection at the
cluster level and in the selection of the individual participants (Pufer et al., 2003).
One of the ways to reduce possible bias is through blinding. Blinding is an attempt to keep trial
participants, investigators, and/or assessors unaware of the interventions being assigned. The
blinding can occur for one group of the study or for all. When blinding is being used, it is
important for the researchers to clearly define what form of blinding is being used and to what
extent (Schulz & Grimes, 2002). In a study conducted by Johnson et al. (2008), it was
discovered that there were biases in studies of conflict mortality. The researchers
had used randomly selected main streets as starting points and then proceeded to draw their samples.
Johnson et al. (2008) determined that the results had been biased due to the fact that main
streets are more highly trafficked and more likely to have casualties than random
neighborhood streets. The researchers had properly randomized. However, they did not clearly
consider a major contributing variable.
In another case, the Edinburgh Trial, which involved breast cancer screening
(Alexander et al., 1989), there was a bias involving socioeconomic status that was not
accounted for. In the Edinburgh Trial, participants were involved in a study to determine if
breast cancer screening reduced mortality. It was discovered that the mortality rate, regardless of
intervention, was higher for those of lower socioeconomic status. In this case the researchers
may have been better served to account for socioeconomic status by stratification in the
analysis stage (Alexander, et al., 1989). Once again, this is an issue that could have been
addressed in the planning stages of the experiment. To increase statistical power and
decrease bias, researchers need to consider possible covariates (Arceneaux, 2005). The
examples listed above are common validity issues that can occur when not considering all
possible variables. They have been discovered throughout the years when conducting cluster
designs. To better understand the progression and evolution of the cluster randomized design,
a brief history will be discussed.
History of cluster designs: Randomization by cluster has been slower to develop
compared to random assignment by individuals due to the added design and analysis
requirements that cluster sampling entails. Donner and Klar (2000) stated that initial studies of
cluster randomization can be traced back to a 1648 study done by Van Helmont. In this study,
participants were assigned in lots to either the experimental group which received the
treatment of bloodletting or to the control group. This, however, cannot be defined as true
randomization due to the fact that the study was not replicated.
The statistical implication of cluster randomization was noted by Lindquist (1940). He
stated that when employing cluster sampling in educational research, there is the possibility of
a large systematic difference from school to school which could account for variability.
Lindquist also stated that these statistical anomalies caused by clustering can be accounted for by
testing the cluster means with standard statistical methods. Glass and Hopkins (1996) argued
that standard statistical methods could not be used in cluster samples. They stated that a
common flaw in educational research is to select schools or classes at random and then
students from those schools or classes. This violates the assumption of independence and
cannot be considered a true random sample, and the proper method of analysis for these types
of studies (cluster sampling) often eludes the most seasoned researchers.
Hansen and Hurwitz (1942) addressed the statistical anomalies of cluster sampling
compared to individual sampling, stating that the increase in variance due to clustering can be
quite substantial even if the correlation among clusters is small. They stated that with
increased cluster size the intra-cluster correlation will drop, but at a rate that is slower than
linear (Donner & Klar, 2000).
According to Donner and Klar (2000), many studies prior to 1978 did not properly
account for clustering either in the design or the sampling, with the exception of the following:
Comstock (1962), Ferbee et al. (1963), and Horwitz and Magnus (1974). Many of the ideas to
account for clustering effects were adapted by Pollack (1966), who studied the organization
and evaluation of trials of prophylactic agents for the control of communicable diseases. He
noted that trials randomizing clusters rather than individuals are less likely to be balanced for
extraneous variables. However, these trials do provide administrative convenience, reduce the
risk of treatment contamination, and increase the likelihood of subject participation (Donner
and Klar, 2000). The Taichung experiment (Berelson & Freedman, 1964) is also referenced as a
noteworthy cluster design, acknowledging that the authors took detailed approaches to both
randomizing and analyzing the clustered data (Donner & Klar, 2000).
Cornfield (1978) discussed the statistical implications of clustering. He examined
the statistical efficiency of clustering and concluded that sample size must be inflated to account
for the effects of randomizing clusters rather than individuals. He summarized his conclusions
by stating that randomizing by cluster should not be discouraged, but the study will yield less
information than if it were conducted with individuals as the unit of analysis. This limitation can
be accounted for by raising the sample size included in the study. The analysis performed in
cluster studies must account for clustering or the results could prove to be misleading.
According to Donner and Klar (2000), after 1978, researchers began to understand the
complexities involved in clustering but had few resources to use to develop appropriate cluster
designs. This problem began to be addressed by Gilliam et al. (1980) and Donner et al. (1981).
However, many authors continued to publish articles that did not address these issues.
Donner (1990) examined studies that employed clustering in their design. He evaluated
these on the following factors: justification for employing cluster randomization, between-cluster
variation accounted for in sample size and/or power calculations, between-cluster
variation accounted for in the analysis, baseline reporting of prognostic factors, consideration of
prognostic factors in the analysis, and the reporting of participant loss to follow-up. In his
evaluation of sixteen studies, he found that only four of them gave reason for justification of
clustering. Only three of the sixteen designs accounted for between cluster variation in the
31
sample size or power of the design. Half of the designs accounted for the between cluster
variation in the analysis of the studies. Thirteen of the sixteen studies accounted for baseline
prognostic factors and thirteen of the sixteen considered those prognostic factors in the
analysis. Fourteen of the sixteen studies included the loss of participants in the analysis. It
should also be noted that half of the trial reviewed used traditional statistical methods to
interpret the results. The Type-I error associated with these procedures is likely to be greatly
increased as a result of clustering (Donner, 1990). The studies that have not taken the
necessary steps regarding clustering cannot be deemed valid and thus their results, although
they may be significant, can be misleading.
A similar review was published by Simpson et al. (1995), who examined primary
prevention trials published from 1990 through 1993. They evaluated 21 articles from this
period and determined that only 4 (19%) accounted for clustering in the sample size and power
analysis of their design. They also discovered that only 12 (57%) accounted for clustering in
their statistical analysis. The methodology and criteria for conducting cluster randomized trials
are clearly available; however, many researchers still seem either to ignore them or to be
unaware of them. This neglect of proper experimental design affects the validity of the results
of the experiments that are conducted. It is the intent of this study to determine to what extent
that validity has been affected.
CHAPTER III
METHODOLOGY
In order for a study to be valid, it is essential to randomize selection at both levels of a
cluster design. There are often times when the cluster level of a two stage cluster design will
be purposely selected. When this occurs, the study violates the principle of randomization. It
is the purpose of this study to demonstrate the extent of the effects of this violation. To
accomplish this, Monte Carlo methods will be used. Monte Carlo refers to repeated
sampling from a probability distribution to determine the long run average of a parameter that
the researcher intends to study (Sawilowsky & Fahoome, 2003). These methods will be used
to create a simulation, that is, the representation of a real life characteristic with a model
(Sawilowsky & Fahoome, 2003). For the purpose of this study, the simulation will be used to
represent students' achievement scores. The simulation will be generated on a WINTEL
compatible personal computer using Compaq Visual Fortran 6.6c. This simulation will answer
the following research question: To what extent is a cluster sample biased when the
researcher fails to randomly select clusters in the initial stage of a two stage cluster sample?
This simulation will create data for a two stage cluster design. The design will have a
population of 100 clusters of equal size, each with 100 individual scores randomly assigned to
it from a normal (Gaussian) distribution. Throughout the 20th century, it was believed that the
Gaussian curve was a good model of the likely outcomes of educational or psychological
testing (Sawilowsky & Fahoome, 2003). In this model, the majority of the data are centered
around the mean, with µ = 0 and σ = 1. Many educational tests are norm referenced, meaning
that they compare the individual being tested to the general population. In this form of testing,
the overall results will fall along the normal curve by definition. There are often times when
results from educational or psychological testing do not fall within the normal curve. However,
for the purposes of this study, using the normal distribution shows results under the best
possible circumstances.
After the scores have been assigned, each cluster will have its mean computed.
These clusters will then be rank ordered from highest to lowest to determine purposeful
selection later in the study. The initial number of clusters will be set to two. Individual scores
from each cluster will be randomly chosen, representing the second stage. After the lower and
upper limits of the confidence intervals have been obtained, they will be stored. The simulation
will be repeated for a total of 10,000 replications. At the conclusion, an overall mean will be
computed for the upper and lower limit of the confidence intervals and recorded.
The simulation will then be repeated, this time using the two clusters with the greatest
means. This process will be repeated 10,000 times and the overall mean of these confidence
intervals will be computed and stored. The upper limit and lower limit of the confidence interval
of the randomly selected group and the purposefully selected group will be compared and the
difference will be computed. The width of the confidence interval for the random selection will
also be computed and compared to the width of the confidence interval for the purposeful
selection using a proportion. This process will be completed 19 times, increasing the number
of clusters by one each time (i.e., 2 clusters, 3 clusters, 4 clusters, etc.) until 20 of the 100
clusters have been chosen and compared under both random and purposeful selection.
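The simulation itself was programmed in Compaq Visual Fortran; the following Python sketch is offered only as a minimal illustration of the procedure described above. Because the exact second-stage sample size and confidence level are not specified in this chapter, the sketch assumes ten scores drawn per selected cluster and a 95% normal-theory confidence interval; with those assumptions it reproduces the structure of the procedure, not the exact values reported in Chapter IV.

    import numpy as np

    rng = np.random.default_rng(seed=1)

    N_CLUSTERS, CLUSTER_SIZE, REPS = 100, 100, 10_000
    N_SECOND_STAGE = 10   # scores drawn per selected cluster (assumed)
    Z = 1.96              # 95% confidence level (assumed)

    # Population: 100 clusters, each with 100 scores from a N(0, 1) distribution
    population = rng.standard_normal((N_CLUSTERS, CLUSTER_SIZE))
    ranked = np.argsort(population.mean(axis=1))[::-1]   # clusters ranked by mean, highest first

    def mean_ci_limits(k, purposeful):
        """Average lower and upper confidence limits over REPS replications for k clusters."""
        lowers, uppers = [], []
        for _ in range(REPS):
            if purposeful:
                chosen = ranked[:k]                      # the k clusters with the greatest means
            else:
                chosen = rng.choice(N_CLUSTERS, size=k, replace=False)
            # Second stage: randomly chosen scores from each selected cluster
            scores = np.concatenate([
                rng.choice(population[c], size=N_SECOND_STAGE, replace=False) for c in chosen
            ])
            half_width = Z * scores.std(ddof=1) / np.sqrt(scores.size)
            lowers.append(scores.mean() - half_width)
            uppers.append(scores.mean() + half_width)
        return np.mean(lowers), np.mean(uppers)

    for k in range(2, 21):
        lower_r, upper_r = mean_ci_limits(k, purposeful=False)
        lower_p, upper_p = mean_ci_limits(k, purposeful=True)
        print(k, round((upper_p - lower_p) / (upper_r - lower_r), 2))  # ratio of CI widths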
In the majority of educational testing, the researcher must account for extraneous
variables, meaning outside influences that the researcher could not account for or test.
One of the advantages of using a Monte Carlo design is that the study operates in a controlled
environment. Therefore, there are no extraneous variables that can influence the study.
As long as the distribution is representative of the population that is tested, the results will not
be skewed by an outside influence. The simulation uses the data that are provided and does
exactly what the researcher programs it to do. Because of the control the researcher has in
setting up the simulation, it is not possible for extraneous variables to affect the study.
CHAPTER IV
RESULTS
Ceteris paribus, equal cluster sampling already lacks the power available in simple
random sampling. The inefficiency can be quantified using Formula A8 (Appendix A; Sudman,
1976), which gives rho (ρ) for a cluster sample compared to a simple random sample. The
magnitude of ρ can be computed in this manner or referenced from previous studies (e.g.,
Sudman, 1976). After ρ has been determined, the ratio of sampling error between cluster
sampling and simple random sampling can be computed using Formula A9 (Appendix A).
For the purposes of this study, a completed rho chart (see Appendix B) has been
compiled. A section of that chart appears below (Table 4.1). Using the number of participants
in each cluster and the approximate rho value, the sampling error of a cluster sample relative
to that of a simple random sample can be determined. For example, a cluster sample with 10
participants per cluster and a ρ value of .02 would have 1.18 times the sampling error of a
simple random sample of the same size. Clearly, using a cluster sample rather than a simple
random sample affects the integrity of the study, which may be acceptable only if
considerations of cost saving are paramount.
Table 4.1: Rho Table

N_bar   rho=.01   rho=.02   rho=.03   rho=.04   rho=.05
  1     1.00      1.00      1.00      1.00      1.00
  2     1.01      1.02      1.03      1.04      1.05
  3     1.02      1.04      1.06      1.08      1.10
  4     1.03      1.06      1.09      1.12      1.15
  5     1.04      1.08      1.12      1.16      1.20
  6     1.05      1.10      1.15      1.20      1.25
  7     1.06      1.12      1.18      1.24      1.30
  8     1.07      1.14      1.21      1.28      1.35
  9     1.08      1.16      1.24      1.32      1.40
 10     1.09      1.18      1.27      1.36      1.45
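The entries in Table 4.1, and in the fuller chart in Appendix B, can be generated directly from the variance inflation factor of Formula A2, with m set to the number of participants per cluster (N_bar); the tabulated values match 1 + ρ(N_bar − 1). The following Python sketch, offered only as an illustration, reproduces the rows of Table 4.1 in this way.

    def inflation_factor(n_bar, rho):
        """Inflation of a cluster sample relative to a simple random sample of equal size."""
        return 1 + rho * (n_bar - 1)

    # Reproduce the rows of Table 4.1 (N_bar = 1 to 10, rho = .01 to .05)
    for n_bar in range(1, 11):
        row = [round(inflation_factor(n_bar, rho), 2) for rho in (0.01, 0.02, 0.03, 0.04, 0.05)]
        print(n_bar, row)
    # e.g., N_bar = 10 -> [1.09, 1.18, 1.27, 1.36, 1.45]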
In order to determine the results of the study, the confidence interval was analyzed for
both random and purposeful cluster selection at each number of clusters. Graphs and tables were
developed for the lower limit and the upper limit of each confidence interval. The widths of the
random and purposeful confidence intervals were also compared using a proportion. First, the
lower limit of the confidence intervals for both random and purposeful selection will be
analyzed. Figure 4.1 shows a graphical comparison of the lower limit of the confidence
intervals for random cluster selection versus the lower limit of the confidence intervals for
purposeful cluster selection. Below this graph is a table (Table 4.2) with the actual results. The
first observation to note is that the lower limit of the confidence intervals for the random
selection of clusters remains consistent independent of the number of clusters chosen. The
same cannot be said for the lower limit of the confidence intervals for the purposeful selection
of clusters. The lower limit of the confidence intervals for the purposeful selection of clusters
shows variability dependent on the number of clusters being chosen. The largest discrepancy
between the purposeful lower limit of the confidence intervals and the random lower limit of
the confidence intervals occurred at the initial selection size of two clusters. At this stage,
the difference between the lower limit of the confidence interval for purposeful selection
and the lower limit of the confidence interval for random selection was 1.8 (-1.948515
versus -0.15). The next selection size of three clusters showed the difference between the
purposeful and random lower limits of the confidence intervals narrowing to 1.15 (-1.29901
versus -0.1502589). The difference in the lower limit of the confidence intervals between
purposeful and random selection of clusters continues to decrease until the last simulation is
compiled using 20 clusters. At that stage, the difference between the purposeful lower limit of
the confidence interval and the random lower limit of the confidence interval was .02
(-0.1948515 versus -0.1774427).
[Line graph: lower confidence limits for purposeful and random cluster selection, plotted for 2 through 20 clusters.]
Figure 4.1: Random vs. Purposeful Lower Limit Cluster Results
Table 4.2: Random vs. Purposeful Lower Limit Cluster Results

Number of Clusters   Lower Purposeful Selection   Lower Random Selection   Difference
  2                  -1.9485                      -0.1534                  1.7951
  3                  -1.2990                      -0.1503                  1.1488
  4                  -0.9743                      -0.1549                  0.8194
  5                  -0.7794                      -0.2029                  0.5765
  6                  -0.6495                      -0.1606                  0.4889
  7                  -0.5567                      -0.1368                  0.4199
  8                  -0.4871                      -0.1362                  0.3509
  9                  -0.4330                      -0.1348                  0.2982
 10                  -0.3897                      -0.1487                  0.2410
 11                  -0.3543                      -0.1401                  0.2142
 12                  -0.3248                      -0.1413                  0.1834
 13                  -0.2998                      -0.1581                  0.1417
 14                  -0.2784                      -0.1495                  0.1289
 15                  -0.2598                      -0.1603                  0.0995
 16                  -0.2436                      -0.1587                  0.0849
 17                  -0.2292                      -0.1636                  0.0657
 18                  -0.2165                      -0.1740                  0.0425
 19                  -0.2051                      -0.1805                  0.0246
 20                  -0.1949                      -0.1774                  0.0174
Similar results were compiled for the upper limit of the confidence intervals.
Figure 4.2 shows a graphical comparison between the upper limit of the confidence intervals
for purposeful cluster selection versus random cluster selection. Below this graph is a table
(Table 4.3) with the actual results.
Once again, the first observation is that the upper limit of the confidence intervals for
random cluster selection remained consistent independent of the number of clusters.
The largest discrepancy between the upper limits of the confidence intervals for purposeful
versus random cluster selection occurred at the initial selection size of two clusters. At this
stage, the difference between the purposeful and random upper limits was 1.85 (2.10664
versus 0.256395). The next selection size of three clusters showed the difference narrowing
to 1.16 (1.404427 versus 0.24761). The difference continues to decrease until the last
simulation is compiled using 20 clusters. At that stage, the difference between the purposeful
and random upper limits was -.01 (0.210664 versus 0.218487).
[Line graph: upper confidence limits for purposeful and random cluster selection, plotted for 2 through 20 clusters.]
Figure 4.2: Random vs. Purposeful Upper Limit Cluster Results
Table 4.3: Random vs. Purposeful Upper Limit Cluster Results

Number of Clusters   Upper Purposeful Selection   Upper Random Selection   Difference
  2                  2.1066                       0.2564                   1.8502
  3                  1.4044                       0.2476                   1.1568
  4                  1.0533                       0.2409                   0.8124
  5                  0.8427                       0.1949                   0.6478
  6                  0.7022                       0.2384                   0.4638
  7                  0.6019                       0.2584                   0.3435
  8                  0.5267                       0.2603                   0.2663
  9                  0.4681                       0.2641                   0.2041
 10                  0.4213                       0.2487                   0.1726
 11                  0.3830                       0.2592                   0.1239
 12                  0.3511                       0.2564                   0.0947
 13                  0.3241                       0.2388                   0.0853
 14                  0.3009                       0.2471                   0.0539
 15                  0.2809                       0.2349                   0.0460
 16                  0.2633                       0.2370                   0.0263
 17                  0.2478                       0.2291                   0.0188
 18                  0.2341                       0.2203                   0.0138
 19                  0.2218                       0.2172                   0.0046
 20                  0.2107                       0.2185                   -0.0078
It is evident that there is a difference between purposeful and random selection in both the
lower limit and the upper limit of the confidence intervals. This difference makes the width of the
purposeful confidence interval greater than the width of the random confidence interval.
The difference in the widths of the confidence intervals between purposeful and random
selection depends on the number of clusters chosen. The extent of that difference is
examined in the graph and table below (Figure 4.3 and Table 4.4).
[Line graph: proportion of the purposeful confidence interval width to the random confidence interval width, plotted for 2 through 20 clusters.]
Figure 4.3: Proportion of purposeful cluster over random cluster
Table 4.4: Proportion of purposeful cluster over random cluster

Number of Clusters   Proportion of purposeful cluster over random cluster
  2                  9.8952
  3                  6.7948
  4                  5.1229
  5                  4.0782
  6                  3.3882
  7                  2.9314
  8                  2.5566
  9                  2.2593
 10                  2.0407
 11                  1.8468
 12                  1.6994
 13                  1.5719
 14                  1.4608
 15                  1.3680
 16                  1.2811
 17                  1.2150
 18                  1.1429
 19                  1.0733
 20                  1.0242
It is evident that the proportion of the width of the confidence interval for purposeful
cluster selection to the width for random cluster selection also changes with the number of
clusters, and the pattern is similar to the previous two graphs of the lower and upper limits of
the confidence intervals. As the number of clusters increases, the ratio between the purposeful
and random widths decreases. For example, the width of the confidence interval for purposeful
cluster selection using two clusters was 9.9 times the width for random cluster selection using
two clusters. The width of the confidence interval for purposeful cluster selection using three
clusters is 6.8 times that of its random counterpart. The ratio of the overall widths of the
confidence intervals continues to decrease as the number of clusters increases, up to the last
simulation at 20 clusters. Using 20 clusters, the width of the purposeful selection is 1.02 times
that of the random selection.
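The proportions in Table 4.4 can be checked directly from the limits reported in Tables 4.2 and 4.3, since each proportion is the purposeful confidence interval width divided by the random confidence interval width for the same number of clusters. A brief Python check for the two-cluster case follows; the small discrepancy in the last decimal place reflects the rounding of the limits to four decimals.

    # Confidence interval limits for two clusters, from Tables 4.2 and 4.3
    lower_p, upper_p = -1.9485, 2.1066   # purposeful selection
    lower_r, upper_r = -0.1534, 0.2564   # random selection

    ratio = (upper_p - lower_p) / (upper_r - lower_r)
    print(round(ratio, 4))  # 9.8953, in line with the 9.8952 reported in Table 4.4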
According to these results, purposeful selection produces a wider confidence interval than
random selection at every number of clusters. The ratio of the purposeful width to the random
width decreases as the number of clusters selected increases, showing that the number of
clusters has an effect on the results of the simulation.
CHAPTER V
CONCLUSIONS AND RECOMMENDATIONS
The most precise sampling approach is a simple random sample. However, in cases
where populations are designated in specific areas or clusters, cluster sampling can be done
to save on time or cost. Conducting a cluster sample will cause a decrease in the efficiency of
the results. When the researcher compounds this with improper, purposeful selection of clusters,
the results of the study can become greatly skewed, even under optimal conditions.
This study determined how much those results would be skewed.
Chapters II and IV discussed the effect that clustering can have on a study.
Using the ρ (rho) chart (Appendix B), it is easily determined that the effect clustering has on a
study depends on the ρ value and the number of participants in a cluster, at least under the
assumption of normality. The chart shows that, even under normal conditions and with
randomization, cluster sampling is still not as statistically efficient as simple random sampling.
When the researcher adds to the limitations of cluster sampling by using purposeful selection,
the study becomes more flawed still.
Using normal data, it was determined that the width of the confidence interval for
purposeful selection of clusters was almost ten times that of the confidence interval for
random cluster selection when two clusters were used. Although two is a small number of
clusters for a study, it is still significant that a study using these parameters would yield a
confidence interval nearly ten times wider than its random counterpart. A study already
compromised by the effects of clustering will be made worse by purposeful cluster sampling.
Such results would, in the best case, be useless and, in the worst case, be detrimental to those
who acted on them.
A study consisting of two clusters is unlikely, so consider the next three results. The
simulation with three clusters had a confidence interval for purposeful cluster selection 6.79
times as wide as that of its random counterpart. The purposeful selection with four clusters had
a confidence interval 5.12 times as wide as its random counterpart, and the purposeful
selection with five clusters had a confidence interval 4.1 times as wide as its random
counterpart. Two observations can be drawn from this. The first is that in the early stages of
cluster selection (number of clusters equal to 2, 3, 4, and 5), the confidence interval for
purposeful selection is, at best, four times as wide as that for random selection. Using results
from any of these studies would be useless or incredibly misleading. The second observation
is that increasing the number of clusters has a significant impact on the results in the early
stages of cluster selection. In the first four simulations, the width ratio decreased by a minimum
of 1.1 (between four and five clusters) and a maximum of 3.1 (between two and three clusters).
These differences begin to narrow in the next four simulations.
In the next four simulations (number of clusters equal to 6, 7, 8, and 9), the confidence
intervals for purposeful selection ranged from 3.38 to 2.04 times as wide as their random
counterparts. As one can see, the differences across these four cluster sizes are considerably
smaller than in the first four simulations. The largest change in the ratio between any two
consecutive numbers of clusters in these simulations was 0.69 and the smallest was 0.3.
These results show the difference in confidence interval width between purposeful and random
selection beginning to narrow.
This trend continues in the final eleven selections, as the ratio of the purposeful
confidence interval width to its random counterpart is as high as 2.04 (number of clusters equal
to 10) and as low as 1.02 (number of clusters equal to 20). In these final eleven simulations, the
results truly begin to narrow, from a difference in ratio of 0.19 (between 11 and 12 clusters) to
a difference in ratio of 0.05 (between 19 and 20 clusters). Even at 20 clusters, which is the best
case scenario according to this study, purposeful cluster selection yielded a confidence interval
1.02 times as wide as that of a random sample. According to this result, the width of the
purposeful confidence interval is increased by 2% compared to that of a random sample. A 2%
increase may seem minor; however, put in the perspective of medical research, it could be
critical.
The results from the random cluster selections are also important to review. Looking at
the random results of the study, one would note that on average the width between the lower
and upper confidence limits is about 0.4. Two conclusions can be derived from this. The first is
that the number of clusters has a minimal effect on the results of the random selection, and the
second is that true randomization works. If the researcher randomizes in both the selection and
the assignment stages of a study, the results will be more accurate than those of purposeful
selection. The second conclusion is important for the researcher who insists on including
specific clusters in a study to make that study "relevant." Using these results, such researchers
can conclude that, regardless of the number of clusters and the specific clusters that are
chosen, random cluster selection will yield a confidence interval width of approximately 0.40.
This indicates that even if the desired clusters do not end up in the study, as long as
randomization is done correctly the results will be consistent over time.
It is also important to note that these results are based on data from a normal distribution.
Participants in real life will not be as predictable as this. By using data from a normal
(Gaussian) distribution, the researcher knows that half of the scores will fall below the mean
and half will fall above it, in the shape of a perfect bell curve. This is not the case in
applied situations, because participants do not fit a bell curve model (Micceri, 1989).
Participants in applied situations are influenced by a variety of factors, and this is especially
true in cluster sampling because of the clustering effect.
The method for cluster sampling is clearly defined by past research, yet the need to
include important clusters in a study clouds the judgment of some researchers. If it is
crucial to include certain participants in the study, there are two possibilities. The first is to
increase the likelihood of those clusters being chosen by increasing the number of clusters
selected. However, this is not a failsafe method, and it may defeat the rationale for cluster
sampling, which is to save time and money. The number of clusters could be increased and the
selection still might not return the desired clusters. Furthermore, if it does not, and the
researcher "randomly" selects clusters again until the desired clusters appear, the researcher is
essentially conducting an ex post facto study, which is not a valid form of study.
The second option is to change the design of the study. For example, if it is necessary to
include the major counties of a state in a study, then the study's parameters should be defined
as those counties. The study would then not be pertinent to the entire state, even though it
would include the desired population. The best option is to select the desired population and
conduct the study while completely following the rules of randomization at all levels. If the
selection does not include the desired clusters, the researcher may or may not be discouraged,
but at least the study would not become what Cornfield (1978, p. 101) called "an exercise in
self deception."
APPENDIX A
FORMULAS
A1. Intraclass Correlation Coefficient
A2. Variance Inflation Factor
D = 1 + (m − 1)ρ
A3. Sample size formula for cluster trials
A4. Adjusted effect size formula for cluster trials
A5. Within mean difference effect size I
A6. Within mean difference effect size II
A7. Within mean difference effect size III
A8. Rho formula for cluster trials
A9. Proportion of error between simple random and cluster sample
APPENDIX B
RHO CHART
The rho chart tabulates the sampling error of an equal-size cluster sample relative to that of a simple random sample, for average cluster sizes N_bar = 1 through 100 and intra-cluster correlations rho = .01 through .24. Each tabulated entry equals the variance inflation factor 1 + ρ(N_bar − 1) (Formula A2); Table 4.1 in Chapter IV reproduces the upper-left corner of the chart. A condensed excerpt follows.

N_bar   rho=.01   rho=.05   rho=.10   rho=.15   rho=.20   rho=.24
   1    1.00      1.00      1.00      1.00      1.00      1.00
   5    1.04      1.20      1.40      1.60      1.80      1.96
  10    1.09      1.45      1.90      2.35      2.80      3.16
  15    1.14      1.70      2.40      3.10      3.80      4.36
  20    1.19      1.95      2.90      3.85      4.80      5.56
  25    1.24      2.20      3.40      4.60      5.80      6.76
  30    1.29      2.45      3.90      5.35      6.80      7.96
  35    1.34      2.70      4.40      6.10      7.80      9.16
  40    1.39      2.95      4.90      6.85      8.80      10.36
  45    1.44      3.20      5.40      7.60      9.80      11.56
  50    1.49      3.45      5.90      8.35      10.80     12.76
  55    1.54      3.70      6.40      9.10      11.80     13.96
  60    1.59      3.95      6.90      9.85      12.80     15.16
  65    1.64      4.20      7.40      10.60     13.80     16.36
  70    1.69      4.45      7.90      11.35     14.80     17.56
  75    1.74      4.70      8.40      12.10     15.80     18.76
  80    1.79      4.95      8.90      12.85     16.80     19.96
  85    1.84      5.20      9.40      13.60     17.80     21.16
  90    1.89      5.45      9.90      14.35     18.80     22.36
  95    1.94      5.70      10.40     15.10     19.80     23.56
 100    1.99      5.95      10.90     15.85     20.80     24.76
BIBLIOGRAPHY
Abdeljaber, M. H., A. S. Monto, et al. (1991). "The Impact of Vitamin-A Supplementation on
Morbidity - A Randomized Community Intervention
Trial." American Journal of
Public Health 81(12), 1654-1656.
Alexander, F., M. M. Roberts, et al. (1989). "Randomization By Cluster and the Problem of
Social-Class Bias." Journal of Epidemiology and Community Health 43(1), 29-36.
Arceneaux, K. (2005). "Using cluster randomized field experiments to study voting behavior."
Annals of the American Academy of Political and Social Science 601, 169-179.
Berg, D. T. (2010). The Fear of Terrorist Attacks in the Southwestern United States: A Cross
Sectional Analysis. PhD, Arizona State.
Campbell, D. T. S., Julian C. (1963). Experimental and Quasi-Experimental Designs for
Research. Boston, Houghton Mifflin Company.
Campbell, M. J. (2000). "Cluster randomized trials in general (family) practice research."
Statistical Methods in Medical Research 9(2), 81-94.
Campbell, M. J., A. Donner, et al. (2007). "Developments in cluster randomized trials and
Statistics in Medicine." Statistics in Medicine 26(1), 2-19.
Cochran, W. G. (1977). Sampling Techniques. New York, John Wiley & Sons: 428.
Coghlan, B., P. M. Ngoy, et al. (2009). "Update on Mortality in the Democratic Republic of
Congo: Results From a Third Nationwide Survey." Disaster Medicine & Public Health
Preparedness 3(2), 8.
Cornfield, J. (1978). "Randomization by group: A formal analysis." American journal of
epidemiology 108, 100-102.
Cox, R. G., L. Zhang, et al. (2007). "Academic performance and substance use: Findings from
a state survey of public high school students." Journal of School Health 77(3), 109-115.
Donald, A. and A. Donner (1997). "Adjustments to the Mantel-Haenszel chi-square statistic
and odds ratio variance estimator when the data are clustered (vol 6, pg 491, 1987)."
Statistics in Medicine 16(24), 2927-2927.
Donner (1990). "A Methodological Review of Non-Therapeutic Intervention Trials Employing
Cluster Randomization." International journal of epidemiology 19(4), 5.
Donner, A. (1998). "Some aspects of the design and analysis of cluster randomization trials."
Journal of the Royal Statistical Society Series C-Applied Statistics 47, 95-113.
Donner, A., N. Birkett, et al. (1981). "Randomization By Cluster - Sample-Size Requirements
and Analysis." American Journal of Epidemiology 114(6), 906-914.
Donner, A. and N. Klar (2000). Design and Analysis of Cluster Randomization Trials in Health
Research. London, Arnold.
Donner, A. and N. Klar (2002). Issues in the meta-analysis of cluster randomized trials, John
Wiley & Sons Ltd.
Donner, A. and N. Klar (2004). "Pitfalls of and controversies in cluster randomization trials."
American Journal of Public Health 94(3), 416-422.
Eldridge, S., D. Ashby, et al. (2008). "Internal and external validity of cluster randomised trials:
systematic review of recent trials." British Medical Journal 336(7649), 876-880.
Falco, D. (2008). Assessing Students' Views Towards Punishment: A Comparison of
Punitiveness Among Criminology and Non-Criminology Students. PhD, Indiana
University, Pennsylvania.
Feng, Z. D., P. Diehr, et al. (1999). "Explaining community-level variance in group randomized
trials." Statistics in Medicine 18(5), 539-556.
Glass, G. V. and K. D. Hopkins (1984). Statistical Methods in Education and Psychology.
Boston, Allyn & Bacon.
Guo, J., R. Whittemore, et al. (2009). "Factors that influence health quotient in Chinese
college undergraduates." Journal of Clinical Nursing 19, 10.
Hansen, M. H., W. N. Hurwitz, et al. (1953). Sample Survey and Methods. New York, John
Wiley and Sons.
Hansen, M. H. and W. N. Hurwitz (1942). "Relative efficiencies of various sampling units in
population inquiries." Journal of the American Statistical Association 37, 89-94.
Hayes, R. J., N. D. E. Alexander, et al. (2000). "Design and analysis issues in cluster-randomized
trials of interventions against infectious diseases." Statistical Methods in Medical
Research 9(2), 95-116.
Hedges, L. V. and E. C. Hedberg (2007). "Intraclass correlation values for planning group-randomized
trials in education." Educational Evaluation and Policy Analysis 29(1), 60-87.
Hsieh, F. Y. (1988). "Sample-Size Formulas For Intervention Studies With The Cluster As Unit
of Randomization." Statistics in Medicine 7(11), 1195-1201.
Hsiu-Lan, T. (1997). The Vocational Interest Structure of Taiwanese High School Students. A.
P. Association. Ping Tung City, National Pingtung Teachers College: 19.
Isaakidis, P. and J. P. A. Ioannidis (2003). "Evaluation of cluster randomized controlled trials
in sub-Saharan Africa." American Journal of Epidemiology 158(9), 921-926.
Johnson, N. F., M. Spagat, et al. (2008). "Bias in epidemiological studies of conflict mortality."
Journal of Peace Research 45(5), 653-663.
Kazumune, U., Fujiwara Takeo, et al. (2010). "Does Social Capital Promote Physical Activity?
A Population-Based Study in Japan." PLOS One 5(8), 6.
Kish, L. (1965). Survey Sampling. New York, John Wiley and Sons.
Klar, N. and A. Donner (1997). "The merits of matching in community intervention trials: A
cautionary tale." Statistics in Medicine 16(15), 1753-1764.
Lewsey, J. D. (2004). "Comparing completely and stratified randomized designs in cluster
randomized trials when the stratifying factor is cluster size: a simulation study."
Statistics in Medicine 23(6), 897-905.
Lindquist, E. F. (1940). Statistical analysis in educational research. Boston, Houghton Mifflin
Company.
Micceri, T. (1989). "The Unicorn, The Normal Curve, and Other Improbable Creatures."
Psychological Bulletin 105(1), 156-166.
Michigan Governmental Website (2010). Retrieved July 21, 2010, from
http://www.michigan.gov/cgi/0,1607,7-158-54534-240589-,00.html.
Montana Office of Public Education (2007). OPI Report. Retrieved June 22, 2010, from
http://www.opi.mt.gov/PDF/Measurement/rptDistCrtResults2007.pdf
New York Governmental Website (2009), Department of Labor. Retrieved June 22, 2010, from
http://www.labor.ny.gov/stats/nys/statewide_population_data.asp.
New York State Department of Education (2010). Retrieved June 22, 2010, from
http://www.emsc.nysed.gov/irts/ela-math/.
Neyman, J. (1935). "Statistical problems in agricultural experimentation." Journal of the Royal
Statistical Society Supplement 2, 19.
Olowa, O. W. (2009). "Effects of the Problem Solving and Subject Matter Approaches on the
Problem Solving Ability of Secondary School Agricultural Education." Journal of
Industrial Teacher Education 46(1), 15.
Perry, C. L., D. E. Sellers, et al. (1997). "The child and adolescent trial for cardiovascular
health (CATCH): Intervention, implementation, and feasibility for elementary schools in
the United States." Health Education & Behavior 24(6), 716-735.
Puffer, S., D. J. Torgerson, et al. (2003). "Evidence for risk of bias in cluster randomised trials:
review of recent trials published in three general medical journals." British Medical
Journal 327(7418), 785-787.
Rao, J. N. K. and A. J. Scott (1992). "A Simple Method For The Analysis Of Clustered Binary
Data." Biometrics 48(2), 577-585.
Rao, P. S. R. S. (2000). Sampling Methodologies with Applications. New York, Chapman and
Hall/CRC.
Raudenbush, S. W. (1997). "Statistical analysis and optimal design for cluster randomized
trials." Psychological Methods 2(2), 173-185.
Raudenbush, S. W., A. Martinez, et al. (2007). "Strategies for improving precision in group-randomized
experiments." Educational Evaluation and Policy Analysis 29(1), 5-29.
Rooney, B. L. and D. M. Murray (1996). "A meta-analysis of smoking prevention programs
after adjustment for errors in the unit of analysis." Health Education Quarterly 23(1), 48-64.
Royce, J. M., N. Hymowitz, et al. (1993). "Smoking cessation factors among African-Americans
and whites." American Journal of Public Health 83(2), 220-226.
Runyon, R., K. Coleman, et al. (2000). Fundamentals of Behavioral Statistics, Mcgraw-Hill.
Sawilowsky, S. S. (2007). Real Data Analysis, Information Age Pub Inc.
Sawilowsky, S. S. and G. G. Fahoome (2003). Statistics Through Monte Carlo Simulation With
Fortran. Oak Park, Mi., JMASM.
Schulz, K. F. and D. A. Grimes (2002). "Blinding in randomised trials: hiding who got what.”
Lancet 359(9307), 696-700.
Simpson, J. M., N. Klar, et al. (1995). "Accounting for cluster randomization - A review of
primary prevention trials, 1990 through 1993." American Journal of Public Health
85(10), 1378-1383.
Sudman, S. (1976). Applied Sampling. New York, Academic Press.
Tassitano, R., M. Barros, et al. (2010). "Enrollment in Physical Education Is Associated With
Health-Related Behavior Among High School Students." Journal of School Health
80(3), 8.
Upton, G. J. G. (1978). The Analysis of Cross Tabulation Data. Chichester, New York,
Brisbane, Toronto, John Wiley and Sons.
Wang, L. and X. Fan (1997). The Effect of Cluster Sampling Design in Survey Research on
the Standard Error Statistic. Chicago, American Educational Research Association: 17.
Wiesberg, H. F., J. A. Krosnick, et al. (1996). An introduction to survey research, polling, and
data analysis. Thousand Oaks, Sage Publications.
Yang, L. T. L. a. L. (2008). "Duration of Sleep and ADHD Tendency Among Adolescents in
China." Journal of Attention Disorders 11, 9.
ABSTRACT
THE EFFECT OF NONRANDOM SELECTION OF CLUSTERS IN A TWO STAGE
CLUSTER DESIGN
by
JASON PARROTT
December 2012
Advisor: Dr. Shlomo Sawilowsky
Major: Evaluation and Research
Degree: Doctor of Philosophy
Although not as efficient as simple random sampling, cluster sampling has been
regarded as a valid sampling technique when the researcher is attempting to save costs.
However, in order for it to be valid, random selection must occur at all stages of
sampling. This simulation study examines purposeful selection of clusters in the first stage of
a two stage cluster design. Using Monte Carlo methods, a simulation was conducted comparing
random selection at both stages of a two stage cluster sample to purposeful selection at the
first stage of a two stage cluster sample. The study compares purposeful selection to random
selection by examining the width of the confidence intervals returned by each simulation.
After conducting the study, it was evident that purposeful selection can yield a confidence
interval nearly ten times as wide as that of its random counterpart.
AUTOBIOGRAPHICAL STATEMENT
My relatively short life of 34 years has been spent immersed in education. I was born
and raised in Royal Oak, Michigan where I attended Royal Oak Schools. I graduated from
Royal Oak Kimball High School in 1996, and attended Central Michigan University the
following year. At Central Michigan University, I earned undergraduate degrees in the areas of
political science and special education. After graduating from Central Michigan University, I
became a special education teacher at Lakeland High School in White Lake, Michigan. While
teaching at Lakeland, I continued my educational career enrolling at Wayne State University,
earning a Masters Degree in the area of Educational Leadership. After I earned my Masters
Degree, I enrolled in the research and evaluation program at Wayne State University and will
graduate with the completion of my dissertation. While completing my doctoral program, I
began to pursue a career in the field of educational administration. I started as the Dean of
Students at Lakeland High School, transitioned to middle school Assistant Principal at Royal
Oak Middle School, and am currently the Principal of Oak Ridge Elementary, in Royal Oak,
Michigan. I have the unique pleasure of being the Principal of the elementary school I attended
as a child. Personally, I have been married for ten years. My wife Laci and I have three
children ages 6, 4, and 2. Laci, Isabela, Olivia, and Maddox are my inspiration and I thank
them for their patience and encouragement during this process. My doctoral program has
been a unique experience and I am looking forward to applying what I have learned in this
process to the next stage of my life.