Estimating k-th Order Spillovers*

Connor Jerzak
Department of Government and Institute for Quantitative Social Science, Harvard University, 1737 Cambridge Street, Cambridge MA 02138

January 25, 2016

* Preliminary.

Abstract

Spillover effects are common in the social and biological sciences, but complicate the potential outcomes framework as first articulated by Neyman in the 1920s and later formalized by Rubin in the 1970s. This paper builds from Athey, Eckles, and Imbens' (2015) recent work on obtaining exact p-values in network inference, which allowed for first-order spillovers. I provide a generalization of their framework by allowing spillover effects of an arbitrary order when analyzing experimental interventions on a single undirected network. After presenting notation for analyzing k-th order spillover effects, I suggest conditional randomization methods to perform inference, developing a generic procedure for identifying the deepest spillover level present in an experiment. I also propose a causal estimand, the k-th net spillover effect, which captures the effect of an intervention as it ripples out through the treated unit's peripheral ties in a social network. Lastly, I use these tools to analyze an anti-bullying field experiment in 56 US schools to show how these procedures not only can help correct for statistical biases, but also can reveal new insights into the dynamics of norm formation. Software to analyze spillover effects in social network experiments will be made publicly available on CRAN as socialSpillovers.

Contents

0 Introduction
1 Notation & exact inference with no interference
   1.1 Notation with no interference
   1.2 Exact inference with no interference
2 Notation & exact inference with interference
   2.1 Notation with interference
   2.2 Exact inference with interference
      2.2.1 Breakdown of exact inference
      2.2.2 Repairing exact inference
      2.2.3 A unified testing framework
      2.2.4 Observational data
      2.2.5 Weakening the Constant Intensity Assumption
3 Real-world example - a conflict intervention
4 Discussion

0 Introduction

The potential outcomes framework was first formalized by Neyman and Rubin (see Rubin, 2005), but researchers began extending this framework to accommodate unit-level interference in the 2000s (see, among others, Oakes, 2004). In this period, social media experiments served as clear examples where the observed outcome of unit i might depend on the treatment status of those connected to this unit. However, with spillovers, the potential outcomes framework must be modified, and exact, randomization-based inference breaks down.

In this paper, we extend existing claims that, in the presence of interference, we can still evaluate sharp null hypotheses, but only if we focus on portions of the network that embody a randomized experiment with no interference. However, to find these interference-free portions of a network, we are required to make assumptions about how many layers of interference (K) are present in the data. If K = 1, only first-order spillovers are present among units.
In an undirected network of friends, this situation would imply that the treatment status of unit i's friends impacts unit i's observed outcome, but the treatment status of unit i's friends' friends does not. However, if K = 2, the treatment status of unit i's friends' friends would then impact unit i's observed outcome as well. In this fashion, K governs the maximum extent to which unit i's treatment status affects other units, as only those within K degrees of separation are impacted by unit i's treatment assignment, and potentially vice versa.

A key issue, however, is that K is unknown a priori. In what follows, we build from Aronow (2012) and Athey, Eckles, & Imbens (2015) as we argue for a procedure that performs a series of hypothesis tests to infer K from the data. The procedure we develop takes into account multiple testing issues, and is crafted so that the search for K is fairly generic. Thus, we argue that spillover effects force us to rethink exact inference: randomization inference is still possible, but requires assumptions that can be made in a principled but data-driven fashion. Moreover, we also contend that spillover effects are substantively interesting, and worthy of study independent of the statistical issues that arise when they are disregarded.

The paper proceeds in four main parts. The first section discusses exact inference when interference is not present. The second section closely examines the implications of interference, and develops a sequential test for determining the spillover structure present in a social network experiment. This section also discusses a series of extensions related to this test. The third section illustrates these procedures by replicating Paluck, Shepherd & Aronow (2016), which presented data from a field experiment intended to reduce forms of school conflict. The final section concludes.
1 Notation & exact inference with no interference

1.1 Notation with no interference

Following the notation in Morgan & Rubin (2012), we here assume that the experiment of interest is $2^1$ (a single factor with two levels), such that all n units either receive treatment ($T_i = 1$) or control ($T_i = 0$), with $i \in \{1, \ldots, n\}$. The vector $T_{n \times 1}$ denotes the full treatment assignment vector. Often, we abuse notation slightly, writing $T$ when we mean $T_{\mathrm{Obs}}$. Now, $Y(0)_{n \times 1}$ denotes the potential outcomes for all units under control and $Y(1)_{n \times 1}$ denotes the potential outcomes for all units under treatment. The entire potential outcomes matrix can be written as $Y_{n \times 2} = (Y(0), Y(1))$. The observed outcomes can be written in this context as $Y_{\mathrm{Obs},i} = Y_i(1) T_i + Y_i(0)(1 - T_i)$. Then, $(Y_{\mathrm{Obs}}(T))_{n \times 1}$ denotes the vector of observed outcome values.

1.2 Exact inference with no interference

To perform exact inference without interference, we generally begin by assuming a sharp null hypothesis, i.e. that $Y_i(1) - Y_i(0) = 0 \;\forall\, i$. In an experimental setting, we observe only $Y_i(1)$ or $Y_i(0)$ for each unit, but this null allows us to impute $Y_i^{\mathrm{Mis}}$ as $Y_i^{\mathrm{Obs}}$. Thus, conditional on this null, "we can empirically create the distribution of any estimator," $g(T, Y_{\mathrm{Obs}}(T))$ (Morgan & Rubin 2012). That is, we can generate the randomization distribution for any test statistic that is a function of $T$ and $Y_{\mathrm{Obs}}$ by permuting the treatment assignment vector, and using the imputed value of $Y_i^{\mathrm{Mis}}$ to recalculate the test statistic.

Randomization tests have several virtues. First, no distributional assumptions are required, only the exchangeability of disturbance terms: the idea that, under the null, the observed outcomes would be similar irrespective of the level of the treatment variable (Erikson, Pinto & Rader, 2010). Another benefit of randomization inference is that we can assess causal claims without appealing to a model, even in small samples.
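The permutation procedure just described can be sketched in a few lines. This is a minimal illustration rather than the paper's software: the function names, the difference-in-means statistic, the number of Monte Carlo draws, and the add-one p-value correction are all assumptions made for the example.

```python
import random
from statistics import mean


def diff_in_means(t, y):
    """Test statistic g(T, Y_obs): treated mean minus control mean."""
    treated = [yi for ti, yi in zip(t, y) if ti == 1]
    control = [yi for ti, yi in zip(t, y) if ti == 0]
    return mean(treated) - mean(control)


def randomization_p_value(t, y, n_draws=2000, seed=0):
    """Two-sided Monte Carlo p-value under the sharp null Y_i(1) = Y_i(0).

    Under this null the observed outcomes stay fixed when T is permuted,
    so only the assignment vector is reshuffled on each draw.
    """
    rng = random.Random(seed)
    observed = diff_in_means(t, y)
    t_perm = list(t)
    extreme = 0
    for _ in range(n_draws):
        rng.shuffle(t_perm)  # keeps the number of treated units fixed
        if abs(diff_in_means(t_perm, y)) >= abs(observed):
            extreme += 1
    # Add-one correction (an assumption of this sketch): the observed
    # assignment is counted as one of the draws.
    return (extreme + 1) / (n_draws + 1)
```

Because the statistic is passed through a single function, any other function of T and the observed outcomes could be substituted for the difference in means without changing the permutation loop.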
Moreover, although the null hypothesis is quite strong, Ding (2014) shows that, in completely randomized experiments, rejection of Fisher's null can be more difficult than the rejection of Neyman's null (which only assumes an average treatment effect of 0). In this sense, randomization tests can be seen as providing a conservative analysis.

Moreover, recall that we can exploit the duality between intervals and tests to form a fiducial interval from a randomization test: we can produce an interval by finding the set of all null hypothesis values that the observed data would fail to reject. Following Lock (2011), we can assume a constant, additive treatment effect in the following calculations (i.e. $Y(1) = Y(0) + \tau$). The hypotheses are then

$$H_0 : \tau = \tau_0 \qquad H_1 : \tau \neq \tau_0, \tag{1}$$

and an α-level fiducial interval for τ consists of the set of $\tau_0$ such that the observed test statistic would not lead to a rejection of the null hypothesis at significance level α. Here, when $\tau_0 \neq 0$, the randomization test is conducted by first constructing

$$Y_i(0)^* = (Y_{\mathrm{Obs},i} - \tau_0)\, T_i + Y_{\mathrm{Obs},i}\,(1 - T_i).$$

Then, keeping $Y_i(0)^*$ fixed, we permute the treatment assignment vector and calculate $Y^*_{\mathrm{Obs}}$ by adding $\tau_0$ to the treatment group outcomes under the permutation. This procedure generates a distribution of $\hat{\tau}$ under the null $H_0 : \tau = \tau_0$. We can use this distribution to calculate a p-value for $\hat{\tau}_{\mathrm{Obs}}$ under the null. We can also form a $100(1 - \alpha)\%$ interval for τ by finding the values of $\tau_0$ which generate a p-value greater than or equal to α. Garthwaite (1996) discusses an efficient algorithm for obtaining randomization-based fiducial intervals, which searches for the interval endpoints using a procedure based on the Robbins-Monro search process.

2 Notation & exact inference with interference

2.1 Notation with interference

In the context of interference, we must expand our notation. Assume that there are K levels of interference. That is, the treatment status of unit i's

$$\overbrace{\text{friends of friends of} \cdots \text{of friends}}^{\times K}$$

may influence i's outcome. Informally, we can denote unit i's $\overbrace{\text{friends of friends of} \cdots \text{of friends}}^{\times k}$ as this unit's k-th order friends (or the unit's friends of depth k). Let $T_i^k$ denote the binary indicator for whether unit i has at least one friend of depth k who receives treatment ($T_i^k = 1$) or none ($T_i^k = 0$). For example, $T_i^0$ denotes the traditional treatment assignment indicator, with $T_i^0 = 0$ when unit i is in the control group, and $T_i^0 = 1$ when unit i is in the treated group. In a similar vein, $T_i^1 = 1$ if unit i has a treated friend, and $T_i^1 = 0$ if unit i has no treated friends. In this fashion, $T^k$ denotes the full binary vector for the k-th spillover level.

The potential outcomes have to be rewritten to account for the influence of spillovers. In particular, the potential outcome for unit i is now a function of the treatment status of all units connected to i up to depth K. If K = 1, there are $2^{(1+1)} = 4$ treatment options to consider: $T_i^0 T_i^1 \in \{00, 01, 10, 11\}$ (see Table 1). As a result, $Y_{n \times 4} = (Y(0,0), Y(0,1), Y(1,0), Y(1,1))$. If K = 2, we have eight treatment options (see Table 2). In general, if the intervention has 2 levels, we implicitly obtain a $2^{K+1}$ factorial experiment while performing inference because there are $2^{K+1}$ distinct "treatment" combinations assuming K levels of spillover.

Lastly, let X denote pre-treatment covariates. It is important to note that, if $T^0$ is randomly assigned such that $T^0 \perp Y, X$, then it follows that $T^k \perp Y, X$ for k > 0 as well. This result follows from the fact that, given the undirected network, $T^k$ is a function of $T^0$ only, and is thus independent from the potential outcomes.

Table 1: Treatment combinations assuming 1 level of spillover.

         Treatment indicator (T_i^0)   Spillover 1 indicator (T_i^1)
Comb. 1  0                             0
Comb. 2  0                             1
Comb. 3  1                             0
Comb. 4  1                             1

Table 2: Treatment combinations assuming 2 levels of spillover.

         Treatment indicator (T_i^0)   Spillover level 1 indicator (T_i^1)   Spillover level 2 indicator (T_i^2)
Comb. 1  0                             0                                     0
Comb. 2  0                             0                                     1
Comb. 3  0                             1                                     0
Comb. 4  0                             1                                     1
Comb. 5  1                             0                                     0
Comb. 6  1                             0                                     1
Comb. 7  1                             1                                     0
Comb. 8  1                             1                                     1

2.2 Exact inference with interference

2.2.1 Breakdown of exact inference

In the presence of interference, simple randomization tests are no longer valid. In a randomization test, we randomly permute the treatment assignment vector assuming a sharp null hypothesis. This procedure depends on the null hypothesis that the observed and unobserved entries in Y are the same. However, with interference, if we randomly permute the treatment assignment vector, the sharp null approach is no longer valid, since permuting whether unit j is treated (i.e. $T_j^0$) may alter $T_i^k$, and $Y_i^{\mathrm{Obs}}$ is a function of $T_i^0, \ldots, T_i^K$. In less formal terms, if we randomly permute the treatment assignment vector in the presence of spillover, we are not only changing the treatment status of unit i, but also the treatment/spillover regime of other units, since these units are impacted not only by $T_i^0$ but also by $T_i^k$. Hence, Fisherian inference breaks down because the sharp null (and exact tests) are invalidated by interference.

To see this difficulty in action, we can generate arbitrary scenarios where randomization tests will yield faulty inferences in the presence of spillover. Consider, for example, the following data generating process on a random (small world) network with $i \in \{1, 2, \ldots, 1000\}$:

$$Y_i^{\mathrm{Obs}} \sim N\!\left(\mu = -0.5\, T_i^0 + 10\, T_i^1,\; \sigma^2 = 0.05\right). \tag{2}$$

Hence, there is a −0.5 treatment effect, and a 10 first-order spillover effect. However, if we ignore the spillover effects and instead perform a naive analysis using a randomization test, we obtain misleading inferences. For example, the estimated average treatment effect is −0.14 (not −0.5), and we fail to reject the 0 null. Thus, in this context, we underestimate the magnitude of the overall average treatment effect, and fail to distinguish this effect from 0.
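To make the dependence of the spillover indicators on the assignment vector concrete, the sketch below computes $T^1, \ldots, T^K$ from $T^0$ and an adjacency list. It assumes that a "friend of depth k" is a unit at shortest-path distance exactly k, which is one reading of the informal definition in Section 2.1; the function names and the adjacency-list representation are illustrative choices, not part of the paper.

```python
from collections import deque


def depth_k_friends(adj, i, max_depth):
    """Return {k: set of units at shortest-path distance exactly k from i}."""
    dist = {i: 0}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        if dist[u] == max_depth:
            continue  # do not explore beyond the requested depth
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    by_depth = {k: set() for k in range(1, max_depth + 1)}
    for v, d in dist.items():
        if d >= 1:
            by_depth[d].add(v)
    return by_depth


def spillover_indicators(adj, t0, max_depth):
    """T[k][i] = 1 if unit i has at least one treated friend of depth k.

    T[0] is the ordinary assignment vector t0.
    """
    n = len(t0)
    T = [list(t0)] + [[0] * n for _ in range(max_depth)]
    for i in range(n):
        for k, friends in depth_k_friends(adj, i, max_depth).items():
            T[k][i] = int(any(t0[j] == 1 for j in friends))
    return T
```

On a four-unit path 0-1-2-3 with only unit 0 treated, this gives $T^1 = (0, 1, 0, 0)$ and $T^2 = (0, 0, 1, 0)$. Moving the single treatment to unit 3 changes $T^1$ to $(0, 0, 1, 0)$, which illustrates why permuting $T^0$ alters the spillover regime of units whose own treatment status did not change.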
In general, spillover effects can cause arbitrarily large levels of bias if researchers fail to account for them.

Figure 1 (panels a and b): Ignoring spillover effects can lead to misleading estimates and faulty exact p-values. Here, the null hypothesis is false and should be rejected. The true effect is −0.5.

2.2.2 Repairing exact inference

As has been discussed in prior literature (see Aronow (2012) and Athey, Eckles, & Imbens (2015)), we can still use exact inference if randomization tests are performed in a manner that preserves the sharp null. For example, with 1 level of spillover, we can perform a randomization test comparing the outcomes of individuals in the subclass with $T_i^0 = 0, T_i^1 = 0$ with the subclass with $T_j^0 = 0, T_j^1 = 1$. When comparing the two groups, we can permute the reduced $T^1$ vector because doing so would not change the $T^0$ vector. In addition, we could also compare the subclass with $T_i^0 = 1, T_i^1 = 0$ and the subclass with $T_j^0 = 0, T_j^1 = 0$, and could validly permute the reduced $T^0$ vector, since doing so would not alter the values of the reduced $T^1$. In essence, we condition on the status of all but one of the K + 1 indicator variables related to treatment and spillover status. With K = 1, we have four direct comparisons:

• Subclass $T_i^0 T_i^1 = 10$ vs. Subclass $T_i^0 T_i^1 = 00$ (immediate effect comparison);
• Subclass 11 vs. Subclass 01 (immediate effect comparison);
• Subclass 01 vs. Subclass 00 (spillover comparison, level 1);
• Subclass 11 vs. Subclass 10 (spillover comparison, level 1).

Hypotheses comparing these subgroups can be evaluated, since they maintain the sharp null within the randomization test. For example, if K = 2, we can perform a randomization test to compare subclasses 101 and 001, but not subclasses 101 and 000. In the latter case, we could not distinguish the influence of $T^0$ from $T^2$ while maintaining the sharp null.
However, by comparing 101 and 001 in the former case, we can obtain valid inferences about the treatment effect while maintaining a sharp null (since all units have no treated friends and at least 1 treated friend of a friend). In sum, if we want to isolate the effect of each spillover layer, we must compare groups that possess one and only one difference in treatment combination.

The general formulation of this discussion is straightforward. With an arbitrary K, we can find the number of legitimate pairwise comparisons that directly assess hypotheses relating to the K-th level of spillover. In particular, we know that we can validly perform all tests comparing the outcomes of

Subclass $T_i^0 T_i^1 T_i^2 \cdots T_i^{K-1} T_i^K = C^0 C^1 C^2 \cdots C^{K-1} 1$ vs. Subclass $T_j^0 T_j^1 T_j^2 \cdots T_j^{K-1} T_j^K = C^0 C^1 C^2 \cdots C^{K-1} 0$,

where the use of C is intended to illustrate that units in both subgroups should have the same treatment/spillover regimes, except at the K-th level. These comparisons generate all the pairwise hypotheses around the K-th spillover level. Notice that there are $2^K$ combinations of the form $T^0 T^1 T^2 \cdots T^{K-1}$. Thus, if we evaluate every valid pairwise hypothesis around the K-th level of spillover, we must perform $2^K$ tests.

We can use these insights to estimate the orders of spillover present in a social network experiment, and one key point is the following. To perform valid randomization inference, we must in essence condition on spillover indices up to the K-th level. If we fail to do so, we may obtain misleading inference, as we saw above. However, K is unknown, and the selection of K can be done in a principled fashion. For example, we can sequentially evaluate all null hypotheses related to spillover level a, where a = G, G − 1, ..., 1. Here, G denotes the a priori maximum prospective level of spillover, and controls the depth of the deepest spillover considered.
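The subclass pairs described above can be enumerated mechanically. The sketch below, with hypothetical function names, lists every pair of subclass labels that agree on $T^0, \ldots, T^{K-1}$ and differ only at level K, and selects the units that belong to a given subclass.

```python
from itertools import product


def valid_comparisons(level):
    """Pairs of subclass labels that differ only in the level-th indicator.

    Each label is a tuple (T^0, ..., T^level); the first `level` entries
    (the C^0 ... C^{level-1} part) are held fixed within a pair.
    """
    pairs = []
    for prefix in product((0, 1), repeat=level):
        pairs.append((prefix + (1,), prefix + (0,)))
    return pairs


def units_in_subclass(T, label):
    """Indices of units whose indicators (T^0, ..., T^level) equal `label`.

    T[k][i] is unit i's level-k indicator.
    """
    n = len(T[0])
    return [i for i in range(n)
            if all(T[k][i] == label[k] for k in range(len(label)))]
```

For level 1 this yields the two spillover comparisons listed earlier (01 vs. 00 and 11 vs. 10), and in general the list has $2^K$ entries, matching the count in the text.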
In other words, we assume G ≥ K ≥ 0, where K denotes the true maximum order of non-zero spillover effects. To limit multiple testing issues, we can employ a simple Bonferroni correction, and, for example, reject null hypotheses that yield p-values below 0.05 ÷ (Total # of tests). This procedure is one potential method of determining K (although the interpretability of this method decreases as G grows). Moreover, this approach also depends on the fact stated above that, if $T^0$ is randomly assigned, then $T^k$ is also assigned effectively at random for k > 0. Hence, the randomization of $T^0$ justifies the use of randomization-based inference for the general $T^k$.

To see this method in action, consider the simulated data from above, generated by the following data generating process on a random network with $i \in \{1, 2, \ldots, 1000\}$:

$$Y_i^{\mathrm{Obs}} \sim N\!\left(\mu = -0.5\, T_i^0 + 10\, T_i^1,\; \sigma^2 = 0.05\right). \tag{3}$$

Assuming K = 1, we now obtain 2 average treatment effects: one that compares 10 and 00, and another that compares 11 and 01 (these values can easily be averaged to obtain a single "overall" treatment effect measure). In both cases, the true effect is −0.5, and this value is recovered from the analysis. In addition, we obtain 2 average spillover effects: one that compares 11 and 10 and another that compares 01 and 00. The true spillover effects of 10 are correctly recovered, and the null correctly rejected. Figure 2 presents these results for the 10 vs. 00 comparison. Analogous results hold for the comparison between 11 and 01. This example shows how, by adjusting for spillovers, we can again obtain unbiased estimates and repair issues around exact inference.

2.2.3 A unified testing framework

From the above, we know that the number of tests needed to evaluate the presence of g-th order spillovers increases exponentially in g. With an arbitrary G, we must perform $2^1 + \cdots + 2^G = 2^{G+1} - 2$ total tests, giving a Bonferroni-corrected rejection threshold (at α = 0.05) of $0.05 \div (2^{G+1} - 2)$.
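The test count and the resulting Bonferroni threshold are easy to check numerically. The helper names below are hypothetical and the snippet only restates the arithmetic in the text.

```python
def n_pairwise_tests(G):
    """Total pairwise tests across levels 1..G: 2^1 + ... + 2^G."""
    return sum(2 ** g for g in range(1, G + 1))


def bonferroni_threshold(alpha, n_tests):
    """Per-test rejection threshold under a Bonferroni correction."""
    return alpha / n_tests


# The closed form 2^(G+1) - 2 agrees with the direct sum for small G.
for G in range(1, 11):
    assert n_pairwise_tests(G) == 2 ** (G + 1) - 2
```

The exponential growth in the number of tests, and the corresponding shrinkage of the per-test threshold, is what motivates replacing the pairwise tests with one test per prospective spillover level.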
When G = 10, we have $2^{10+1} - 2 = 2046$ total tests, and an α = 0.05 rejection threshold of 0.05 ÷ 2046 ≈ 0.000024. Clearly, this approach seems untenable in many situations, especially when the sample size is small. We can instead employ a testing framework that performs a single hypothesis test for each prospective level of spillover from G to 1. With this alternative approach, the α = 0.05 rejection threshold becomes 0.05/G, or 0.005 when G = 10. A single hypothesis test for each prospective spillover level also enhances the interpretability of the results.

The test statistic for this unified test can be constructed in the following fashion. First, assume we are analyzing spillovers of level g. Then, we know from the above that we have $2^g$ subgroup comparisons, denoted generically by $\tau_1, \tau_2, \ldots, \tau_{2^g}$, where these τ values were formed by comparing units in subclass $C^0 C^1 C^2 \cdots C^{g-1} 1$ and subclass $C^0 C^1 C^2 \cdots C^{g-1} 0$. We can form an estimate of the overall effect. If n denotes the total sample size, and $n_s$ denotes the number of units used to estimate $\tau_s$, then

$$\overset{\bowtie}{\tau} = \frac{1}{n} \sum_{s=1}^{2^g} \tau_s \cdot n_s, \tag{4}$$

which we denote as the k-th net spillover effect (here with k = g), and which captures the overall effect of having at least one friend of depth k treated by integrating across all other treatment combinations. We can then perform a valid randomization test on $\overset{\bowtie}{\tau}$ to evaluate the overall presence of g-th level spillovers. The global randomization test is valid because each component of $\overset{\bowtie}{\tau}$ could itself be tested via a valid randomization test.

Figure 2 (panels a and b): By conditioning on each unit's full treatment status (including spillovers), we correctly reject the null, and obtain an unbiased estimate of the true effect. These results show the 10 vs. 00 comparison. Analogous results hold for the comparison between subgroups 11 and 01.
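The net spillover effect in equation (4) and a test built on it can be sketched as follows. This is an illustrative reading rather than the socialSpillovers implementation: it assumes that the level-g labels are shuffled within each group of units sharing the same $(T^0, \ldots, T^{g-1})$, that subclass pairs with an empty arm contribute nothing, and that n is the full sample size. All function names are hypothetical.

```python
import random
from collections import defaultdict
from statistics import mean


def net_spillover(T, y, g):
    """Weighted sum of subclass contrasts at level g, as in equation (4).

    T[k][i] is unit i's level-k indicator; units are grouped by their
    prefix (T^0, ..., T^{g-1}) and contrasted on T^g within each group.
    """
    groups = defaultdict(lambda: ([], []))  # prefix -> (y | T^g=0, y | T^g=1)
    for i, yi in enumerate(y):
        prefix = tuple(T[k][i] for k in range(g))
        groups[prefix][T[g][i]].append(yi)
    total = 0.0
    for y0, y1 in groups.values():
        if y0 and y1:  # a contrast needs both arms
            total += (mean(y1) - mean(y0)) * (len(y0) + len(y1))
    return total / len(y)


def unified_test(T, y, g, n_draws=1000, seed=0):
    """Estimate and Monte Carlo p-value for the null of no level-g spillover.

    Assumption of this sketch: T^g labels are shuffled within each prefix
    group, holding T^0..T^{g-1} and the outcomes fixed.
    """
    rng = random.Random(seed)
    observed = net_spillover(T, y, g)
    members = defaultdict(list)
    for i in range(len(y)):
        members[tuple(T[k][i] for k in range(g))].append(i)
    extreme = 0
    for _ in range(n_draws):
        Tg = list(T[g])
        for idx in members.values():
            labels = [T[g][i] for i in idx]
            rng.shuffle(labels)
            for i, lab in zip(idx, labels):
                Tg[i] = lab
        T_perm = list(T[:g]) + [Tg]
        if abs(net_spillover(T_perm, y, g)) >= abs(observed):
            extreme += 1
    return observed, (extreme + 1) / (n_draws + 1)
```

Since each unit belongs to exactly one prefix group, it enters exactly one contrast, so the weights $n_s / n$ sum to at most one.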
In other words, we can scramble the treatment/spillover regimes for units in two subclasses differing only in the last spillover indicator, and iterate over all subclass comparisons to form a single test. This single test replaces multiple pairwise tests done on the valid subclass comparisons. In addition, because all comparisons evaluate the potential outcomes of subclass $C^0 C^1 C^2 \cdots C^{g-1} 1$ vs. subclass $C^0 C^1 C^2 \cdots C^{g-1} 0$, no unit is "double counted" in the randomization test. That is, each unit contributes to one and only one component of $\overset{\bowtie}{\tau}$ (i.e. no unit contributes to both $\tau_i$ and $\tau_j$ for $i \neq j$).

We can use this procedure in the simulated data to correctly determine that K = 1 after setting G = 5. Recall the following data generating process on a random network:

$$Y_i^{\mathrm{Obs}} \sim N\!\left(\mu = -0.5\, T_i^0 + 10\, T_i^1,\; \sigma^2 = 0.05\right). \tag{5}$$

This procedure correctly identifies K = 1, and recovers the first net spillover effect of 10.

Figure 3 (panels a and b): This method correctly rejects the null of a 0 net spillover effect of order 1, and provides an unbiased estimate of the true effect (10).

2.2.4 Observational data

This framework can be extended to observational data. Indeed, in observational settings, we now must assume that, conditional on confounding covariates, the observed outcomes would be similar irrespective of the level of the treatment variable (Erikson, Pinto & Rader, 2010). In observational data, the potential outcomes may not be independent from the treatment indicator. In the simplest formulation, $Y(0), Y(1) \not\perp T^0$. If we assume conditional ignorability, we can say $Y(0), Y(1) \perp T^0 \mid X$, where X denotes the set of covariates that influence both $T^0$ and the potential outcomes.
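Under conditional ignorability as stated above, one simple way to condition on X in a randomization test is to permute treatment labels only within strata of X. The sketch below assumes that X has already been coarsened into discrete strata (for example, bins of an estimated propensity score); that coarsening step, the size-weighted statistic, and the function names are assumptions of this example rather than details given in the paper.

```python
import random
from collections import defaultdict
from statistics import mean


def stratified_stat(t, y, strata):
    """Size-weighted average of within-stratum differences in means."""
    cells = defaultdict(lambda: ([], []))  # stratum -> (y | t=0, y | t=1)
    for ti, yi, si in zip(t, y, strata):
        cells[si][ti].append(yi)
    num, den = 0.0, 0
    for y0, y1 in cells.values():
        if y0 and y1:  # skip strata without both arms
            size = len(y0) + len(y1)
            num += (mean(y1) - mean(y0)) * size
            den += size
    return num / den  # raises ZeroDivisionError if no stratum has both arms


def stratified_p_value(t, y, strata, n_draws=1000, seed=0):
    """Permute t within strata only, so the X-T relationship is preserved."""
    rng = random.Random(seed)
    observed = stratified_stat(t, y, strata)
    index = defaultdict(list)
    for i, si in enumerate(strata):
        index[si].append(i)
    extreme = 0
    for _ in range(n_draws):
        t_perm = list(t)
        for idx in index.values():
            labels = [t[i] for i in idx]
            rng.shuffle(labels)
            for i, lab in zip(idx, labels):
                t_perm[i] = lab
        if abs(stratified_stat(t_perm, y, strata)) >= abs(observed):
            extreme += 1
    return observed, (extreme + 1) / (n_draws + 1)
```

The same within-stratum shuffling could be applied to a spillover indicator $T^k$ in place of t, with the strata defined jointly by X and the lower-order indicators.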
Thus, if we assume that there is no unmeasured confounding, we can perform a valid randomization test by estimating the τ's after conditioning on X at each stage because, after conditioning, the outcomes will again be exchangeable under the null. In other words, randomization inference breaks down if there are confounders, but we can account for this relationship by conditioning on these variables in the randomization test. Thus, it is possible to employ this spillover framework in an observational setting.

Simulations can again illustrate these phenomena. Adapting the earlier simulation framework, assume

$$Y_i^{\mathrm{Obs}} \sim N\!\left(\mu = -0.5\, T_i^0 + 10\, T_i^1 + X_i^{T} \beta,\; \sigma^2 = 0.05\right), \tag{6}$$

where $(X_i)_{3 \times 1}$ denotes the pre-treatment confounders and $\beta_{3 \times 1}$ denotes the coefficients relating these confounders to the outcome. In addition, $T_i^0$ is determined by a propensity model that is a function of $X_i$. We see in Figure 4 that, after adapting the earlier unified procedure to analyze the k-th order net spillover effects by first conditioning on confounders, the approach gains precision in the estimation of the net spillover effects.

Figure 4 (panels a and b): After conditioning on confounders, our precision greatly improves in the estimation of the net spillover effects.

2.2.5 Weakening the Constant Intensity Assumption

Earlier, we formed treatment/spillover regime subgroups while implicitly making the Constant Intensity Assumption: the idea that, as long as at least one friend of depth k is treated, the number of additional treated depth-k friends does not impact one's potential outcomes. In some situations, this assumption may be unrealistic, since, for example, it is plausible that norms may change more easily when 10 of one's friends receive treatment compared to when only 1 receives the intervention. In addition, the sharp null of the randomization test made no distinction between units having 100 treated friends versus 1 treated friend.
We can loosen aspects of this assumption while remaining within the horizons of exact inference. For example, we can partition the spillover space based on cut-points, and perform randomization inference on valid combinations of these partitions, as we did earlier when defining $\overset{\bowtie}{\tau}$. Although this partitioning reduces the interpretability of $\overset{\bowtie}{\tau}$, it can provide additional sensitivity to the inferential procedure outlined above, and can also be used to identify non-linearities present in a social system.

To clarify this partitioning method, let $S_i^k$ denote the number of unit i's k-th order friends who are treated (and let $S^k$ denote the concatenated vector). In this context, $T^0$ still denotes the binary treatment assignment vector, but when k > 0, $T_i^k$ now refers to a p-level factor variable, where p denotes the number of partitions made based on the cut-points of $S_i^k$. In this new framework, we still assume a sharp null between comparison groups, but these groups differ. For example, if K = 2, p = 4, and k > 0, we let $T_i^k \in \{1, 2, 3, 4\}$ denote membership in subgroups with respect to $S_i^k$ (see Table 3). In this context, we are now comparing subgroups such as 113 and 112 instead of 111 and 110. Intuitively, the overall test for spillovers gains in sensitivity because units in the tails of the $S_i^k$ distribution will receive greater weight than previously, when all units received the same weight. Tail-end units receive greater relative weight because they are pooled with fewer units in each subgroup. Hence, this procedure is more sensitive to non-linear spillover effects.

Table 3: Treatment combinations assuming 2 levels of spillover, and weakening the Constant Intensity Assumption. S^1 and S^2 are partitioned into four groups based on cut-points.

Treat. Comb   Treatment indicator (T_i^0)   Spillover level 1 level (T_i^1)   Spillover level 2 level (T_i^2)
Comb. 1       0                             1                                 1
Comb. 2       0                             1                                 2
Comb. 3       0                             1                                 3
Comb. 4       0                             1                                 4
Comb. 5       0                             2                                 1
Comb. 6       0                             2                                 2
Comb. 7       0                             2                                 3
Comb. 8       0                             2                                 4
Comb. 9       0                             3                                 1
Comb. 10      0                             3                                 2
Comb. 11      0                             3                                 3
Comb. 12      0                             3                                 4
Comb. 13      0                             4                                 1
Comb. 14      0                             4                                 2
Comb. 15      0                             4                                 3
Comb. 16      0                             4                                 4
Comb. 17      1                             1                                 1
Comb. 18      1                             1                                 2
Comb. 19      1                             1                                 3
Comb. 20      1                             1                                 4
Comb. 21      1                             2                                 1
Comb. 22      1                             2                                 2
Comb. 23      1                             2                                 3
Comb. 24      1                             2                                 4
Comb. 25      1                             3                                 1
Comb. 26      1                             3                                 2
Comb. 27      1                             3                                 3
Comb. 28      1                             3                                 4
Comb. 29      1                             4                                 1
Comb. 30      1                             4                                 2
Comb. 31      1                             4                                 3
Comb. 32      1                             4                                 4

To see this method in action, take an extreme case of non-linear spillovers in a social network:

$$Y_i^{\mathrm{Obs}} \sim N\!\left(\mu = -0.5\, T_i^0 - 10 \cdot I\!\left[S_i^1 \geq \mathrm{quantile}(S^1, 0.99)\right],\; \sigma^2 = 0.05\right), \tag{7}$$

where I[·] denotes the indicator function. In this data-generating process, most treated units have an average of −0.5, and most control units have an average of 0. However, individuals whose number of treated friends is at or above the 99th percentile of $S^1$ receive a downward shock of −10 to their observed outcome. Because so few units are affected by the spillover process, we sometimes fail to reject the null that there are no spillovers under the Constant Intensity Assumption. However, if we loosen this assumption and partition $S^1$ into groups based on its cut-points, we can more easily distinguish the spillover effect from 0. Figure 5 illustrates these results: with the Constant Intensity Assumption, we obtain a false negative rate of 32 percent in the non-linear dataset; with the weaker assumptions we obtain a 0 percent false negative rate.

Overall, we can weaken the Constant Intensity Assumption to account for non-linearities present in social network experiments. Essentially, we do so by providing a more fine-grained partition of each unit's spillover data.

Figure 5 (panels a and b): If we weaken the Constant Intensity Assumption, we obtain greater sensitivity. Here, the null hypothesis is false and should be rejected.
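The partitioning step can be sketched as follows for first-order friends. The paper does not state how the cut-points are chosen, so the cut-points passed in below are purely hypothetical, as are the function names; the convention that a count equal to a cut-point falls in the higher level is also an assumption.

```python
from bisect import bisect_right


def treated_friend_counts(adj, t0):
    """S^1_i: number of unit i's direct friends who are treated."""
    return [sum(t0[j] for j in friends) for friends in adj]


def partition_levels(s, cut_points):
    """Map each count to a level in {1, ..., p}, with p = len(cut_points) + 1.

    A count s_i is assigned level 1 + (number of cut-points <= s_i).
    """
    cuts = sorted(cut_points)
    return [1 + bisect_right(cuts, si) for si in s]
```

With three cut-points this produces the four-level factor used in Table 3; the resulting levels can then replace the binary indicator when forming comparison subclasses.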
3 Real-world example - a conflict intervention

In this section, we replicate Paluck, Shepherd & Aronow (2016), which presented data from a field experiment intended to reduce forms of school conflict. This study encouraged a small number of students to take a public stance against bullying, and network characteristics of the student population were studied to analyze the types of students most influential over social norms. As a result, this data is uniquely suited for examining spillover effects, since such effects are substantively interesting because they relate to how norms are conveyed throughout a social network. With this example, we show how our inferential procedure can identify the structure of the "ripple effect" generated by the anti-bullying treatment.

This section will be completed after receiving the data. An application is currently on file with the Princeton IRB.

4 Discussion

This paper has developed a unified framework for identifying spillover in social network experiments. We have argued that spillover effects render unreliable standard techniques based on exact inference. To address these difficulties, researchers must account for spillover effects, and must make assumptions about the maximum level of spillover. We have discussed how this decision can be made in a principled, data-driven manner, and hope to release our methods in a CRAN package called socialSpillovers. Lastly, by replicating Paluck, Shepherd & Aronow (2016), we have illustrated how spillovers are substantively interesting.

Future work could address the following. First, how could randomization inference be further extended to non-binary treatments? We attempted to address this issue by partitioning the non-binary spillover treatments based on cut-points, but further work should examine this issue in more detail. Second, Campanharo et al. (2011), Marwan et al. (2009), Donner et al.
(2010), and others have argued that time series datasets can be mapped into complex networks, making it possible to translate time series into networks and vice versa. Can the notion of spillover as understood in the network context be applied to the analysis of causality in time series?

In the end, spillover effects seem to present a promising area for inquiry. On the one hand, these effects are important for obtaining unbiased causal estimates. On the other hand, spillover effects bring insight into how norms or information are conveyed through a network, and how social conventions might evolve in response to an intervention. Thus, as interference brings new challenges to the study of causality, it might also enable new discoveries.

References

[1] Aronow, P. M. A General Method for Detecting Interference Between Units in Randomized Experiments. Sociological Methods & Research 41, 1 (Feb. 2012), 3–16.
[2] Athey, S., Eckles, D., and Imbens, G. Exact P-values for Network Interference. arXiv:1506.02084 (2015).
[3] Campanharo, A. S. L. O., Sirer, M. I., Malmgren, R. D., Ramos, F. M., and Amaral, L. A. N. Duality between Time Series and Networks. PLoS ONE 6, 8 (Aug. 2011).
[4] Centola, D., and Baronchelli, A. The spontaneous emergence of conventions: An experimental study of cultural evolution. Proceedings of the National Academy of Sciences 112, 7 (Feb. 2015), 1989–1994.
[5] Ding, P. A paradox from randomization-based causal inference. ArXiv e-prints 1402 (Feb. 2014), arXiv:1402.0142.
[6] Donner, R. V., Zou, Y., Donges, J. F., Marwan, N., and Kurths, J. Recurrence networks - a novel paradigm for nonlinear time series analysis. New Journal of Physics 12, 3 (2010), 033025.
[7] Dwass, M. Modified Randomization Tests for Nonparametric Hypotheses. The Annals of Mathematical Statistics 28, 1 (1957), 181–187.
[8] Edgington, E., and Onghena, P. Randomization Tests, Fourth Edition. CRC Press, Feb. 2007.
[9] Efron, B., and Tibshirani, R. J. An Introduction to the Bootstrap. CRC Press, May 1994.
[10] Erikson, R. S., Pinto, P. M., and Rader, K. T. Randomization Tests and Multi-Level Data in U.S. State Politics. State Politics & Policy Quarterly 10, 2 (2010), 180–198.
[11] Fisher, R. A. Statistical methods for research workers, 4th ed. Oliver and Boyd, Edinburgh, 1932.
[12] Liu, J., and Li, Q. Planar Visibility Graph Network Algorithm For Two Dimensional Timeseries. ArXiv e-prints 1411 (Nov. 2014), arXiv:1411.6438.
[13] Manly, B. F. J. Randomization, Bootstrap and Monte Carlo Methods in Biology, Third Edition. CRC Press, Aug. 2006.
[14] Marwan, N., Donges, J. F., Zou, Y., Donner, R. V., and Kurths, J. Complex network approach for recurrence analysis of time series. Physics Letters A 373, 46 (Nov. 2009), 4246–4254.
[15] Neyman, J. On the Two Different Aspects of the Representative Method: the Method of Stratified Sampling and the Method of Purposive Selection. In Breakthroughs in Statistics, S. Kotz and N. L. Johnson, Eds., Springer Series in Statistics. Springer New York, 1992, pp. 123–150. DOI: 10.1007/978-1-4612-4380-9_12.
[16] Ngamga, E. J., Nandi, A., Ramaswamy, R., Romano, M. C., Thiel, M., and Kurths, J. Recurrence analysis of strange nonchaotic dynamics. Physical Review E, Statistical, Nonlinear, and Soft Matter Physics 75, 3 Pt 2 (Mar. 2007), 036222.
[17] Oakes, J. M. The (mis)estimation of neighborhood effects: causal inference for a practicable social epidemiology. Social Science & Medicine 58, 10 (May 2004), 1929–1952.
[18] Paluck, E. L., Shepherd, H., and Aronow, P. M. Changing climates of conflict: A social network experiment in 56 schools. Proceedings of the National Academy of Sciences 113, 3 (Jan. 2016), 566–571.
[19] Pesarin, F., and Salmaso, L. Permutation Tests for Complex Data: Theory, Applications and Software. John Wiley & Sons, Feb. 2010.
[20] Rosenbaum, P. R. Conditional Permutation Tests and the Propensity Score in Observational Studies. Journal of the American Statistical Association 79, 387 (1984), 565–574.
[21] Rubin, D. B. Causal Inference Using Potential Outcomes. Journal of the American Statistical Association 100, 469 (Mar. 2005), 322–331.
[22] Splawa-Neyman, J. On the Application of Probability Theory to Agricultural Experiments. Essay on Principles. Section 9. Statistical Science 5, 4 (1990), 465–472.