Appendix 2: Identifying Physician Networks

Our approach to identifying PPCs used group-finding techniques common to social network analysis (SNA). SNA is the term used to capture a set of mathematical and graphical techniques for measuring and analyzing how units are connected to each other.1 In general, a social network consists of two classes of data: nodes (physicians in this case) and edges, which capture the level of interaction between pairs of nodes. Physicians are connected (edges) by the patients they share. The joint set of all nodes and edges in a population constitutes the network. Our method for finding PPCs involved constructing the entire network of physicians who share patients and then identifying smaller groups within the network as the "physician practice communities."

Defining physician networks

The first stage of this process is illustrated in the figure at right. Any patient visiting more than one physician creates a link between each pair of physicians they visit. For example, Patient 3 creates links between A and E, E and F, and A and F. Networks extend through overlapping patients; for example, physicians A and F share two patients, and F and C also share two. A and C have no patients in common but are connected indirectly through F. The set of all nodes reachable through a chain of shared patients is called the largest component of the graph.

Our interest is in identifying local physician practice communities, so we emphasized strong and relatively local ties in three ways that differ from the simple counting model implied in the figure above. First, we coded each patient's contribution to the strength of the tie between two physicians as the minimum number of times either physician had seen the patient; we then summed over all common patients to obtain the total edge strength between the two physicians. Second, if a pair of physicians shared only a single-visit, single-patient tie (value 1), we recoded the edge value to 0.025. Finally, we excluded ties between pairs of physicians in the top decile of geographic distance.

The logic behind weighting ties as the minimum number of distinct visits to either physician, rather than a simple count of shared patients, is to capture the joint familiarity between physicians through shared treatment. If, for example, physician A sees a patient 10 different times and physician B only once, it seems incorrect to count their connection as equivalent to that of two physicians who each see the same patient 10 times. By choosing the minimum, we discount patients who were shared in a limited way in favor of stronger, more established relationships. Since this is summed over all shared patients, the number of shared patients is embedded in the measure. Across our 15 state-year combinations, the average correlation between the simple count of patients shared and the sum across all shared patients of the minimum number of days the patient was seen by either physician is 0.87 (std dev 0.01), showing a clearly linear relation in the scatter plot (on log-log scale; figure available on request). Since the group detection model uses a weighted tie value and these measures correlate very highly, we expect minimal differences in the final results.
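To make the first two weighting rules concrete, the following is a minimal Python sketch of the edge construction; the data structure and names (visit_counts, edge_weight) are hypothetical illustrations, not our production SAS/PAJEK pipeline.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical input: visit_counts[patient][physician] = number of visits.
visit_counts = {
    "patient_3": {"A": 10, "E": 1, "F": 2},
    "patient_7": {"A": 4, "F": 3},
}

edge_weight = defaultdict(float)
for patient, visits in visit_counts.items():
    # Each shared patient contributes the minimum number of visits
    # made to either physician in the pair.
    for doc_i, doc_j in combinations(sorted(visits), 2):
        edge_weight[(doc_i, doc_j)] += min(visits[doc_i], visits[doc_j])

# A total weight of exactly 1 can only arise from a single shared patient
# seen once by at least one of the physicians; recode such ties to 0.025.
for pair, weight in edge_weight.items():
    if weight == 1:
        edge_weight[pair] = 0.025
```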
The primary purpose of recoding low-value ties created by a single shared patient visit from 1 to a small-but-nonzero value (we used 0.025) was to mitigate the variability created by patient sampling and interaction with border-state physicians and to emphasize substantive shared-patient relations (in keeping with the insights from Barnett et al. (2011)). Upon examination, many low-volume edges were either "pendant" ties -- composed of a single node sharing one patient with another physician in the largest component -- or part of a wider set of weak ties among out-of-state physicians. While we contemplated removing these edges entirely, we felt giving them a light weight would allow pendant nodes to link to their partners while minimizing the effect of ad hoc ties in creating non-substantive bridges between real groups.

To check model sensitivity to this transformation, we compared the results in Pennsylvania in 2010 -- the largest network, and thus the one with the most recoded values -- by running the clustering routines with and without the recoded values. The resulting partitions are very similar. For the first-stage clustering, used to identify large regional clusters, the two approaches match with a Cramer's V of 0.97 and an adjusted Rand index of 0.93, indicating a very high correspondence. We then ran the second, local-level clustering within each of the larger groups. The average adjusted Rand index was 0.88 with a median of 0.92, the difference due to a single low outlier, which upon investigation turned out to be composed entirely of low-volume, out-of-state relations. Both coding schemes consistently assigned within-state physician pairs to the same clusters. Since we remove any cluster that is majority out-of-state from the sample for the care analyses, these differences would have had no effect on our ultimate modeling. Thus, while we think the lower weighting is a reasonable hedge against ad hoc connections in these lower-sampled ties, it appears to have little substantive effect on the set of resulting PPCs we use in our modeling.

The choice to exclude long-distance ties is primarily to help bound the network around the region in each selected state. Since edges link nodes indirectly, long chains of patients can connect thousands of physicians into a single component; our preliminary work indicated that even local samples of patients generate nationally extensive networks. For example, the figure at left shows that our sample of patients from Pennsylvania generated links to physicians around the nation (with understandably greater volume within the state; degree is the number of other physicians in this sample each physician is connected to). While this larger network would be important for studying problems such as diffusion, the full structure is not likely to be useful for studying physician practice communities -- the local networks in which patients and their physicians are embedded. In large urban areas near state borders, such as Philadelphia, PA, or Vancouver, WA, it is likely that some PPCs span the border, so a simple hard-coding rule that excluded all out-of-state ties was not appropriate. Since the shape of the distance distribution is similar in all five states, differing only in scale, selecting a common percentile cutoff from the edge-distance distribution provides a consistent rule that is also tailored to the variable geography of each state.
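A minimal sketch of the percentile-based distance rule, assuming edge distances have already been computed for one state-year (the array contents here are invented for illustration):

```python
import numpy as np

# Hypothetical input: one geographic distance (in km) per edge.
edge_distances = np.array([1.2, 3.5, 0.8, 42.0, 310.5, 7.7, 95.1])

# A common percentile (the 90th) yields a cutoff tailored to each state's
# geography rather than a single hard-coded mileage rule.
cutoff = np.percentile(edge_distances, 90)
keep = edge_distances <= cutoff  # edges in the top decile are dropped
```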
The figure below provides the cumulative distribution of cases by distance for each state and year (within-state distributions are very similar, so points overlap). The 90% cutoff value corresponds to the leveling off of the tail of the distribution. Since the weight of an edge is typically much lower for long-distance ties, the edges removed were largely low-weight ties, and as such, we expect this to have little or no effect on the within-region community detection process.

Identifying Physician Practice Communities within the network

There is a large literature on the difficult task of identifying communities within networks.2-5 We made primary use of the well-known Blondel model,6 as implemented in the software package PAJEK,7 with all other data manipulations computed in SAS. Two key choices inform this process: (a) the use of the Blondel et al. detection method rather than alternative detection routines, and (b) the selection of the resolution parameter for identifying PPCs within regions. We discuss each in detail below.

In general, community detection involves partitioning network nodes into mutually exclusive groups to maximize within-group ties and minimize between-group ties. While there are other definitions of network groups, the relative-density formulation is by far the most standard and the most appropriate for this project (Porter et al. 2009; Moody and Coleman 2013).4,5 The emerging standard metric for community detection is the modularity index,8 calculated as:

Q = \frac{1}{2m} \sum_{ij} \left( A_{ij} - \gamma \frac{k_i k_j}{2m} \right) \delta(C_i, C_j)

where m is the number of edges, k_i is the degree of node i, A_{ij} is the edge weight between physicians i and j, \delta is an indicator that equals 1 if nodes i and j are in the same community, and \gamma is a "resolution parameter" that identifies the scale at which clustering is observed (Reichardt and Bornholdt 2006; Fortunato and Barthelemy 2007).9,10 Substantively, k_i k_j / 2m represents the null model -- the expected contact between nodes with these degrees -- so (A_{ij} - \gamma k_i k_j / 2m) is the connectivity above random expectation, normalized by the total volume of ties in the network. Modularity reaches a maximum of 1 if all ties fall within distinct groups and has a value of zero if ties are as likely within as between communities.

The resolution parameter is a key feature, particularly as networks become very large (as in the national figure above). Optimization of Q in these types of geographically grounded large networks tends to be biased toward finding a small number of large groups, so a naive search to maximize Q can lead to unsatisfactory results as multiple small groups are lumped into larger aggregates. We carefully developed multiple strategies for combating this tendency, resulting in a uniquely tuned tool for identifying comparatively small (~150 physician) groups.

The Blondel model uses a local aggregation strategy, first finding many small groups, then treating each resulting group as a "super-node" in a (now smaller) network, and then repeating this process at the higher level, continuing until no improvement in modularity is identified. In general, such "greedy algorithm" approaches can fail if an initial assignment is poor (as each step builds on the prior step). A unique feature of the Blondel approach is that early assignments are tested against multiple alternatives at later levels, allowing corrections to the assignment process that other "fast and greedy" style algorithms do not.
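For concreteness, here is a minimal NumPy sketch of the modularity formula above; the dense-matrix form is for illustration on small graphs only and assumes an undirected weighted network.

```python
import numpy as np

def modularity(A, communities, gamma=1.0):
    """Modularity Q with resolution parameter gamma.

    A: symmetric weighted adjacency matrix (numpy array).
    communities: community label for each node.
    """
    k = A.sum(axis=1)              # (weighted) degree of each node
    two_m = A.sum()                # 2m: total edge weight, both directions
    same = np.equal.outer(communities, communities)  # delta(C_i, C_j)
    return ((A - gamma * np.outer(k, k) / two_m) * same).sum() / two_m

# Toy example: two cleanly separated two-node groups give Q = 0.5.
A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(modularity(A, np.array([0, 0, 1, 1])))  # 0.5
```

Raising gamma raises the penalty on expected within-group contact, so smaller, denser groups are needed to score well, which is the lever we use below.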
Importantly, the runtime for the Blondel model on sparse graphs (such as ours) is linear in |V| and thus practical for graphs as large as ours. We spent considerable energy fine-tuning and calibrating our community detection algorithm and have high confidence in the stability and reliability of the method. Still, plausible alternative community detection routines exist, and it is worth discussing the implications of algorithm choice.

Landon et al.11 used the Girvan-Newman edge-betweenness algorithm. The Girvan-Newman model recursively deletes the edges that link parts of the network that are otherwise less connected (those with high "edge betweenness"), effectively removing the weakest links connecting otherwise separate sets. Betweenness is recalculated after every edge removal, and the process generates a tree recording which nodes remain together the longest. The user then identifies communities based on the edge-removal step along the tree that maximizes the modularity score. A key advantage of the G-N algorithm is that it is essentially deterministic, with randomness coming into play only to break edge-weight ties. The critical disadvantage, however, is that the method is computationally intense, making it impossible to implement on networks of the size we are dealing with here (our networks have millions of edges). For example, the iGraph implementation of the algorithm has runtime O(|V||E|^2). We initially attempted running the iGraph implementation of the G-N algorithm, but it did not return a result after days of running.

A theoretically attractive alternative is the Oslom hierarchical statistical model.12 The Oslom model has an attractive multi-level structure that could be used to automate our two-level process, and it incorporates a statistical testing framework for assessing cluster detection. Unfortunately, initial tests proved unworkable in these data, as we were unable to get the models to run. Finally, given the dynamic nature of our data, fully dynamic methods13 would be attractive, but they are not currently implemented for networks of this scale; methods to do so might be developed in the future.

To identify PPCs in these large regional networks, we used a two-stage implementation of the basic Blondel model, run separately by state. In the top-level stage, we run the model on the full network with a low resolution parameter (0.75) to identify a small number of very large clusters (typically fewer than 10 in each state). This results in large regional concentrations that are highly segmented (with modularity scores over 0.9). We then re-applied the group detection routine within these large regional clusters at a higher resolution to identify the PPCs (see the sketch following this section).

The primary calibration step here involves setting the resolution parameter. All modularity-maximizing approaches (including the Girvan-Newman edge-betweenness algorithm) must make an equivalent scale choice that governs the size of the clusters ultimately identified. This is a fundamental feature of all network clustering routines, analogous to choosing an alpha level for statistical tests or a factor-loading cutoff in scale construction. We can use this to our advantage by searching over a wide range of values and selecting a value that returns consistently reasonable results. Stability of results over a range of parameters indicates an underlying reality to the groups found. While computationally intensive, this provides a grounded way to choose a parameter.
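A minimal sketch of the two-stage procedure, using networkx's Louvain implementation (the same Blondel et al. algorithm) as a stand-in for our PAJEK runs; the toy edge list is invented for illustration.

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities

# Hypothetical weighted physician network for one state-year.
G = nx.Graph()
G.add_weighted_edges_from([("A", "B", 3.0), ("B", "C", 2.0),
                           ("C", "D", 0.025), ("D", "E", 4.0)])

# Stage 1: low resolution (0.75) to find a few large regional clusters.
regions = louvain_communities(G, weight="weight", resolution=0.75, seed=1)

# Stage 2: re-run at higher resolution (1.25, chosen by the calibration
# sweep described below) within each regional cluster to find PPCs.
ppcs = []
for region in regions:
    sub = G.subgraph(region).copy()
    ppcs.extend(louvain_communities(sub, weight="weight",
                                    resolution=1.25, seed=1))
```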
We evaluated a sweep of solutions with resolutions in the range of 0.5 to 2.0 (in 0.25-step intervals) with respect to overall clustering, group size (relative to a target median size between 100 and 150 and a maximum below 1,000), and cluster stability (measured as the adjusted Rand index across multiple runs); a sketch of such a sweep appears at the end of this appendix. As a general guide, we computed a composite fit score based on these four values and compared the distribution of fit across the range of resolution values. We repeated this in each state and selected the single resolution value that seemed most robust across all states and years, which produced a consensus resolution parameter value of 1.25. The effect of this calibration step is fairly direct: had we selected a markedly lower resolution value, we would have identified a smaller number of larger PPCs (with most of the effect on the maximum group sizes observed), while a markedly higher resolution value would have generated a larger number of smaller groups. While the Blondel algorithm has proved fast and accurate in very large networks, we employed a final node-level reassignment sweep to ensure that nodes were placed in the group where the majority of their neighbors resides;3 this affected fewer than 1% of nodes.

There are usually many PPCs within the same local area serving the same hospitals. The figure at right highlights two small PPCs that both admit patients to one hospital in Pennsylvania (both PPCs see patients from other hospitals as well; placement is approximate, based on jittered zip code centroids). The mixing matrix is typical of geographically proximate PPCs: while there are shared patients across PPCs, the rate of patient sharing is many times greater within PPCs than between them (roughly 5 times higher in this example), with the strongest ties falling within PPCs (darker edges in panel c).
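Returning to the calibration step above, the following sketch shows how such a resolution sweep could be scored, using networkx's Louvain implementation and scikit-learn's adjusted Rand index as stand-ins for our PAJEK/SAS pipeline; the composite fit score itself is omitted, and all function names here are illustrative.

```python
import networkx as nx
import numpy as np
from networkx.algorithms.community import louvain_communities
from sklearn.metrics import adjusted_rand_score

def labels_from(partition, nodes):
    # Convert a list of node sets into a label vector aligned to `nodes`.
    lookup = {n: i for i, part in enumerate(partition) for n in part}
    return [lookup[n] for n in nodes]

def resolution_sweep(G, resolutions=np.arange(0.5, 2.01, 0.25), runs=5):
    nodes = list(G)
    for res in resolutions:
        parts = [louvain_communities(G, weight="weight", resolution=res, seed=s)
                 for s in range(runs)]
        sizes = [len(group) for group in parts[0]]
        # Stability: mean adjusted Rand index across all pairs of runs.
        labels = [labels_from(p, nodes) for p in parts]
        stability = np.mean([adjusted_rand_score(labels[i], labels[j])
                             for i in range(runs) for j in range(i + 1, runs)])
        # Compare median/max group size against targets (median 100-150,
        # max < 1,000) and stability when choosing a consensus resolution.
        yield res, np.median(sizes), max(sizes), stability
```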