Supplemental Methods

Supplemental Materials:
I. Weighted semi-partial correlation matrices
Table 1 (Supplemental material): Weighted semi-partial correlation matrices for Calanoids (upper) and Daphniids (lower) involving phylogeny and biogeography (left side) and phylogeny and environment (right side). Correlations that are statistically significant are highlighted in red (with FDR corrections for multiple comparisons) or pink (uncorrected). Blank entries are correlations that could not be calculated for technical reasons, either because of insufficient data (the number of occupied lakes must be greater than the number of variables tested) or because, for biogeographic contrasts, all species in the clade were absent from any of the biogeographic areas tested. $R^2_{adj}$ is the regression adjusted coefficient of determination considering all predictors within the Biogeographical or Environmental sets.

II. Phylogenetic Coding and Calculation of $\bar{P}$
Note that parts of the text in the paper are repeated here in order to provide a better link between this expansion on the phylogenetic coding and the paper.

The species-by-node matrix P contains the phylogenetic coding. In principle, there are other possible coding schemes (discussed below), but we use a scheme based on a node-by-node approach inspired by Felsenstein's phylogenetic independent contrasts (PIC, Felsenstein 1985). We code the species-by-node P matrix to facilitate the calculation of a set of node-by-community statistics contained in the matrix $\bar{P}$. In matrix P, all species descending from one of the branches emanating from a node are given negative values, and those descending from the other branch are given positive values (which branch is given the negative sign is arbitrary). If a species is not a descendent of a particular node, it is given a value of 0. The species codes of the two branches sum to 1 and -1, respectively; thus species located in more species-rich branches are downweighted. In the same way that there are several ways of representing phylogenetic diversity in communities (e.g., Cadotte et al. 2010), there are different possibilities for coding phylogenetic information related to the weights that can be given to each species to reflect the phylogenetic topology and branch lengths. For the analyses in this paper, we used a coding that is based on ancestral state reconstruction but does not account for branch lengths (which are not available for our composite phylogeny), but we suggest below a coding scheme to account for branch length variation as well.

In our scheme, codes begin at 1 (or -1, depending on the arbitrary sign of the branch) at the origin of the branch, and are reduced by half at each bifurcating node until the species are reached. Thus a species that is alone on a branch has a code of 1, the two species of a two-species branch have codes of (0.5, 0.5), a three-species branch gives (0.5, 0.25, 0.25), and so forth (for examples, see Figure 2). More formally, each entry in the matrix (species i and node j) was given a value of $p_{ij} = (0.5)^{d_{ij}}$, where $d_{ij}$ is the number of intermediate nodes passed on the path between the focal node j and the species in question i. So if the species is the only daughter species on a branch emanating from a node, it has a value of 1, and if it is one of two species on a branch, it has a value of 0.5, and so forth. If the focal node j does not lie on the path between the root and species i (i.e., species i is not a daughter species of node j), the entry has no value (i.e., it takes a value of 0) and that species is not included in the analysis for that node. The coding is quite simple and, with the values for the first node (tree root), one can rebuild the entire tree.

Once the phylogenetic coding matrix P is built, the community value $\bar{P}$ for a given node-community pair is simply the sum of the $p_{ij}$ of all the species (i) descendent from node j that occur in the local community. This can be achieved by simply multiplying the incidence matrix I by P (i.e., $\bar{P} = IP$; Figure 2). Therefore, $\bar{P}$ is a combined function of the location of the species on the phylogeny and their distribution in the landscape. Note that because of the multiplication process, species having 0 entries for any particular focal coding have no influence on $\bar{P}$ even if they are present at a particular site.

The entries of the node-by-community $\bar{P}$ matrix are simply the sums of $p_{ij}$ values for all species that are both descendents of a given node and occupants of the community (note $\bar{P} = IP$). The $\bar{P}$ values reflect the differential representation of species in the community along the two different branches emanating from a node in the phylogeny. The values in $\bar{P}$ range between 1, where all daughter species of one of the branches are present but none of the other are present, and -1, the opposite scenario where all of the second but none of the first are present. If all species that are daughters of a given node are present at a particular site, then the value is 0 for the site in question, which means there is no evidence from that community that the two branches differ in their propensity to occur under the given local conditions. Note that the expected value of $\bar{P}$ is 0 if all species have the same probability of occurrence, and is nonzero if species from one branch have a different probability of occurrence than species of the other branch, regardless of whether the two sides have different numbers of descendent species.

However, if one side of the node contains species having on average greater site occupancy (i.e., occupying more sites, regardless of whether that side contains more species), then, under chance alone (i.e., random site occupancy), $\bar{P}$ will tend to differ from zero (smaller or greater depending on whether the species having greater average occupancies were coded as negative or positive, respectively). Because the standardization procedure (see section V) is based on site occupancy, this bias in $\bar{P}$ is corrected. We used simulations (not shown here) to confirm that the standardization scheme does correct for the potential bias described. Moreover, we also used simulations (not shown here) to assess the possibility of statistical bias (i.e., elevated type I error rates, in which the probability of our procedure rejecting the null hypothesis when there is no association between $\bar{P}$ and a predictor would be greater than the expected alpha; see Peres-Neto et al. 2001 for an overview). The results indicate that our entire framework (i.e., contrast, standardization, weighted regression and permutation tests) has correct type I error rates (i.e., they equal the pre-established alphas of 0.01 and 0.05).
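
The statement about the expected value can be checked with a short calculation. The sketch below assumes that species occur independently, each with its own probability, so that the expected node value is the sum of each code times the occurrence probability of its species. The codes are those of a four-species node with branches (A, B, C) and (D); the probabilities and the function name are hypothetical and chosen only for illustration.

```python
# Codes at one node for a toy clade: branch abc is negative, branch d positive.
CODES = {"A": -0.25, "B": -0.25, "C": -0.5, "D": 1.0}

def expected_node_value(codes, occurrence_prob):
    """Expected node value: sum over species of code times occurrence probability."""
    return sum(codes[s] * occurrence_prob[s] for s in codes)

# Same probability for every species: the two branches cancel exactly.
equal = {s: 0.5 for s in CODES}
print(expected_node_value(CODES, equal))

# Branch d more likely to occur than branch abc: the expectation is positive,
# even though abc contains three species and d only one.
unequal = {"A": 0.2, "B": 0.2, "C": 0.2, "D": 0.6}
print(expected_node_value(CODES, unequal))
```

Because the codes on each branch sum to -1 and 1, the expectation reduces to the difference between the two branch probabilities whenever the probability is constant within a branch.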

The matrix $\bar{P}$ is a combined function of the location of the species on the phylogeny and their site occupancies. Looking across many communities, correlations between these node-community values $\bar{P}$ and the site variables of those communities reveal that the two branches emanating from the node have diverged in their response to an environmental filter (e.g. different temperature tolerances) or biogeographic event (e.g. on different sides of a historical dispersal barrier). This procedure is equivalent to calculating a Phylogenetic Independent Contrast (PIC) at the node, by coding the states of all present species as 1 and the absent species as 0, reconstructing the "ancestral states" of the two branches, and subtracting them. Note again that while the procedure is quantitatively inspired by PIC, we do not interpret our ancestral state values as statements of whether the actual ancestor of the species occurred in the community; rather, the procedure is quantitatively useful for coding phylogenetic composition across sites and for our purposes of evaluating differential representation of the two branches across communities.

Examples: We present a number of illustrated examples intended to demonstrate the phylogenetic coding procedure (creating the P matrix), the calculation of $\bar{P}$, and the logic behind both. Consider node III in the phylogeny depicted in figure S1. We use the names abc and d to identify the branches emanating from node III and leading to species ABC and D, respectively (see figure S1). After arbitrarily assigning the species of one of the branches (abc) negative values, the values given by the formula $p_{ij} = (0.5)^{d_{ij}}$ are (A: -0.25, B: -0.25, C: -0.5, D: 1). The $d_{ij}$ for A and B is 2, as a path from A or B to node III passes through two intermediate nodes, while C passes 1 and D passes 0. The $\bar{P}$ is simply the sum of all $p_{ij}$ of the species that occur in the local community and are daughters of the node in question. In case 1, all species are present and the sum is 0; in case 2, all are absent and the sum is also 0. Likewise one could use a mathematically equivalent PIC approach, by coding the states of all present species as 1 and the absent species as 0, reconstructing the states of the abc and d lineages, and then subtracting them. This gives all species a value of 1 in the first case or 0 in the second case, so the reconstructed branch states are abc: 1 and d: 1, or abc: 0 and d: 0, and in both cases the subtraction gives a value of 0.

Now consider four additional scenarios of occurrence and the calculations for node III: case 3) D present only, case 4) A present only, case 5) AD present, and case 6) CD present. In case 3, one entire branch (d) emanating from node III is present in the local community but no species of the other (abc) are present, which is the maximum difference that can occur in one community. The $\bar{P}$ of all present species is simply the value for D, 1. Likewise the opposite occurs when only ABC is present: the sum of those scores is -1. Note that if we were reconstructing a PIC for case 3, ABC would get values of zero and D would get a value of 1. The reconstructed branches would still be 0 and 1, respectively, and their subtraction (d-abc) would give us 1.

In case 4, A is the only species occurring in the site. There is information in the partial occurrence of the abc lineage, and the absence of species B, C, and D. As A shares recent evolutionary history with B, it individually contributes less to the reconstruction of abc than does species C. The sum of all values ($\bar{P}$) is simply the value for A, -0.25. This implies that the abc branch is weakly more represented than the d branch under the conditions of the site. Using a PIC approach, we would first take the mean of the states of A (1-present) and B (0-absent), which is 0.5. Then we would take the mean of this value with the state of C (0-absent), which gives us the state of the abc lineage, 0.25. The state of the d lineage is 0-absent. So subtracting (d-abc) gives us our value of -0.25.

In case 5, AD are present, so the $\bar{P}$ for the community is 0.75. This indicates that all species of the d lineage are present, but only part of the abc clade is present, thus indicating that d is more represented in the local community. Again, using PIC, we would take the average of A and B, which is 0.5, then average that with C (0-absent), to give a value of 0.25 for the abc branch. The subtraction, d-abc, gives us our value of 0.75.

In case 6, C and D are present, so the sum of the values (-0.5+1) is 0.5. Using the PIC approach, the mean of A and B is 0, which we then average with the value for C (1), to give us abc = 0.5. The value of d is 1, and after the subtraction (d-abc), we have our final value of 0.5.
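
The six cases can be reproduced side by side with the two calculations. The sketch below is illustrative only: it assumes the abc branch has the topology ((A, B), C), uses unweighted averaging of the two daughters at each node as the PIC-style reconstruction, and uses hypothetical function names.

```python
# abc branch below node III, assumed topology ((A, B), C); d is the single species D.
ABC = (("A", "B"), "C")
CODES = {"A": -0.25, "B": -0.25, "C": -0.5, "D": 1.0}  # node III codes

def reconstruct(clade, present):
    """PIC-style state of a lineage: average the two daughter states at each node."""
    if isinstance(clade, str):
        return 1.0 if clade in present else 0.0
    return (reconstruct(clade[0], present) + reconstruct(clade[1], present)) / 2

def contrast_pic(present):
    """Reconstructed state of d minus reconstructed state of abc."""
    return reconstruct("D", present) - reconstruct(ABC, present)

def contrast_sum(present):
    """Matrix-based value: sum of the node III codes of the species present."""
    return sum(CODES[s] for s in present)

CASES = {1: "ABCD", 2: "", 3: "D", 4: "A", 5: "AD", 6: "CD"}
for case, members in CASES.items():
    present = set(members)
    print(case, contrast_pic(present), contrast_sum(present))
```

Both columns agree in every case (0, 0, 1, -0.25, 0.75 and 0.5), which is the equivalence described in the text.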
III. Incorporating phylogenetic branch lengths
If branch lengths are available, one can incorporate this information by reallocating weights reflecting both topology and lengths, while preserving our original scheme in which the weights of each side of a node sum to -1 and 1, respectively. As there are many ways to measure phylogenetic diversity (Cadotte et al. 2010), there is likely no single "correct" weighting scheme to code for phylogenetic relationships. Indeed, although it is beyond the scope of this paper, we anticipate that an interesting area of future work would be exploring and modifying the phylogenetic coding to reflect different scenarios and applications of the method to increase the statistical power of our framework or perhaps to test different aspects of community phylogenetics.

Here we propose a simple scheme to allocate weights based on trees with variable branch lengths, but where each species is the same distance from the root. The coding is based on the intuition that species that share recent evolutionary history should be individually down-weighted because they are not evolutionarily independent. Thus, we down-weight species based on how much branch length they share with other species. We propose the following formula for species i and node j:

$$p_{ij} = \left( l_1 + \frac{l_2}{2} + \frac{l_3}{3} + \dots + \frac{l_n}{n} \right) / B$$

where $l_1$ is the length of the terminal branch of species i, $l_2$ is the length of the branch that species i shares with 1 other species, and $l_n$ is the length of the branch that species i shares with n-1 other species. The first term sums over only those branches (l values) between the species i and node j. The second term, B, is the sum of all branch lengths on the one side of node j; dividing by it normalizes the values on that side to sum to 1. In figure S2, we work through the calculation for several nodes in an example phylogeny with variable branch lengths.
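
The branch-length weighting can be sketched as follows. This is a hedged illustration and not the calculation shown in figure S2: the branch lengths below are hypothetical, the clade is written as nested (length, contents) pairs, and the branch connecting the focal node to the clade is assumed to count among the branches on that side of the node. Exact fractions are used so that the normalization can be checked without rounding.

```python
from fractions import Fraction

# Hypothetical ultrametric clade on one side of a focal node.
# Each clade is (length of the branch leading to it, species name or (left, right)).
ABC_SIDE = (1, ((1, ((1, "A"), (1, "B"))), (2, "C")))

def tips(clade):
    """Species names below a clade."""
    _, below = clade
    if isinstance(below, str):
        return [below]
    return tips(below[0]) + tips(below[1])

def raw_weights(clade):
    """Give each species an equal share l/n of every branch it descends from."""
    length, below = clade
    names = tips(clade)
    share = Fraction(length, len(names))  # branch shared by n species: l / n each
    weights = {name: share for name in names}
    if not isinstance(below, str):
        for child in below:
            for name, w in raw_weights(child).items():
                weights[name] += w
    return weights

def total_length(clade):
    """B: sum of all branch lengths on this side of the focal node."""
    length, below = clade
    if isinstance(below, str):
        return length
    return length + total_length(below[0]) + total_length(below[1])

def branch_length_codes(clade, sign=1):
    """Normalized codes for one side of a node; they sum to +1 or -1."""
    B = total_length(clade)
    return {name: sign * w / B for name, w in raw_weights(clade).items()}

codes = branch_length_codes(ABC_SIDE, sign=-1)
print(codes)
```

With these hypothetical lengths, A and B each receive -11/36 and C receives -7/18, so C outweighs A or B individually while A and B together outweigh C.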

Note that, although species on long terminal branches get higher weights, this coding is consistent with the idea that long branches should be down-weighted in reconstructions, because a long branch implies more opportunity to change. In the example (figure S2) for node III, note that while the weight for C is greater than that for A or B individually, the sum (A+B) is greater than C. In other words, the common ancestor of A and B, located at node II, is closer to node III and has a greater weight than C. In summary, the individual species A and B are downweighted relative to C because they share evolutionary history, but their combined weight is greater because their mean is likely to be closer to the ancestor than C.

IV. The meaning of $\bar{P}$
The value of $\bar{P}$ for a single community simply tells us the representation of the two branches on either side of a node relative to each other in the local community. When compared across many communities, it can be used to correlate representation of different branches with site characteristics. So if, in the example above for node III, the ABC clade tends to occur in shallow ponds but D tends to occur in deep ponds, then $\bar{P}$ should be correlated with depth across sites. This approach is potentially susceptible to artifacts due to having species with different total occupancies, or other similar effects. This is why a simple correlation must be compared with appropriate null models, which are described in the main text, to assess significance.
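
The pond example can be made concrete with invented data. In the sketch below the six sites, their depths, and their species lists are entirely hypothetical, and the statistic is a plain unweighted Pearson correlation; it does not include the standardization, weighted regression, or null-model tests used in the paper, so it only illustrates the direction of the expected association.

```python
from math import sqrt

CODES = {"A": -0.25, "B": -0.25, "C": -0.5, "D": 1.0}  # node III codes

# Hypothetical sites: (depth in metres, species present).
SITES = [(1.0, "AB"), (1.5, "ABC"), (2.0, "AC"), (6.0, "CD"), (8.0, "D"), (9.0, "D")]

def node_value(present):
    """Node III value for one site: sum of the codes of the species present."""
    return sum(CODES[s] for s in present)

def pearson(x, y):
    """Unweighted Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

depths = [depth for depth, _ in SITES]
values = [node_value(present) for _, present in SITES]
r = pearson(depths, values)
print(values, r)
```

Because D was coded positive and occurs in the deeper hypothetical ponds, the correlation is positive; reversing the arbitrary sign of the branches would only flip its sign.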

V. Standardization procedure of the matrix P of phylogenetic contrast and matrix E of environment
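Matrix P (species x nodes) was standardized as follows:

$$P_{std} = \left[ P - 1_k 1_k^{T} W_k P \, \frac{1}{\mathrm{trace}(W_k)} \right] \oslash \left[ \left( 1_k 1_k^{T} W_k (P \circ P) - \left( (1_k^{T} W_k P) \circ (1_k^{T} W_k P) \right) \frac{1}{\mathrm{trace}(W_k)} \right) \frac{1}{\mathrm{trace}(W_k) - 1} \right]^{0.5}$$
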
where $P_{std}$ is the matrix of standardized node contrasts, $1_k$ is a (k x 1) column vector of ones, k is the number of species, T denotes matrix transpose, and $W_k$ is a (k x k) diagonal matrix with elements equal to the sum of the columns of the incidence matrix I, where the first non-zero element of $W_k$ (i.e., [$W_k$(1,1)]) is the sum of the occurrences across all sites of species 1, and so on. Thus, trace($W_k$) equals the total sum of all species' occurrences. ($\circ$) denotes the Hadamard multiplication (i.e., element-wise product), ($\oslash$) denotes the Hadamard division (i.e., element-wise division) and 0.5 denotes the element-wise square root.

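We standardized the environmental matrix E as follows:

$$E_{std} = \left[ E - 1_n 1_n^{T} W_n E \, \frac{1}{\mathrm{trace}(W_n)} \right] \oslash \left[ \left( 1_n 1_n^{T} W_n (E \circ E) - \left( (1_n^{T} W_n E) \circ (1_n^{T} W_n E) \right) \frac{1}{\mathrm{trace}(W_n)} \right) \frac{1}{\mathrm{trace}(W_n) - 1} \right]^{0.5}$$
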
where $E_{std}$ is the matrix of standardized environmental predictors. We also standardize the biogeographic matrix in the same way by replacing E by B. $1_n$ is a (n x 1) column vector of ones, n is the number of sites (or patches), and $W_n$ is a (n x n) diagonal matrix with elements equal to the sum of the rows of the incidence matrix I, where the first non-zero element of $W_n$ (i.e., [$W_n$(1,1)]) is the sum of the occurrences across all species for patch 1, and so on. Thus, trace($W_n$) also equals the total sum of all species' occurrences. The complexity of both formulae is due to the fact that they standardize all nodes across all species and all environmental variables across all sites at once; though complex, they simply standardize each node or environmental variable by its weighted average and variance, where weights are based on the number of sites occupied by each species ($P_{std}$) and the number of species in each site ($E_{std}$). Note, however, that our method is flexible enough to incorporate other weighting schemes, including traditional standardization (i.e., mean=0 and variance=1), where all diagonal values in $W_k$ and $W_n$ are set to one, and schemes in which higher weights are given to species with intermediate occupancy and sites with intermediate richness, which should carry the most information (see Peres-Neto et al. 2006), where diagonal values in $W_k$ and $W_n$ are set equal to the variance of species columns and the variance of site rows, respectively.
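
Column by column, the standardization amounts to subtracting a weighted mean and dividing by a weighted standard deviation. The sketch below is an assumption-laden illustration rather than the authors' code: it treats the operator lost between trace(W) and 1 as a minus sign, applies the row vector of weighted column sums to every row, standardizes every entry of a column (including zeros), and uses a toy coding matrix with hypothetical occupancy weights. The function name is also hypothetical.

```python
from math import sqrt

def weighted_standardize(matrix, weights):
    """Standardize each column by its weighted mean and weighted variance.

    matrix  : list of rows (species x nodes, or sites x variables)
    weights : one weight per row, i.e. the diagonal of W
    """
    total = sum(weights)                                    # trace(W)
    out = [row[:] for row in matrix]
    for j in range(len(matrix[0])):
        col = [row[j] for row in matrix]
        s1 = sum(w * x for w, x in zip(weights, col))       # 1' W x
        s2 = sum(w * x * x for w, x in zip(weights, col))   # 1' W (x o x)
        mean = s1 / total
        var = (s2 - s1 * s1 / total) / (total - 1)
        for i, x in enumerate(col):
            out[i][j] = (x - mean) / sqrt(var)
    return out

# Toy coding matrix for the tree (((A, B), C), D): rows A-D, columns nodes I-III.
P = [[-1.0, -0.5, -0.25],
     [1.0, -0.5, -0.25],
     [0.0, 1.0, -0.5],
     [0.0, 0.0, 1.0]]
occupancy = [5, 3, 8, 4]   # hypothetical number of sites occupied by A, B, C, D

P_std = weighted_standardize(P, occupancy)
print(P_std)
```

Setting every weight to one recovers the traditional standardization to mean 0 and variance 1 mentioned above; the same function applies to E with site richness as the weights.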

Figure S1: A graphical depiction of the phylogenetic coding scheme, and calculation of the $\bar{P}$ statistic. In each case we calculate $\bar{P}$ for node III in two ways, the method inspired by PIC and the matrix-based method. P is the phylogenetic coding and I is the row from the incidence matrix corresponding to that site.

Figure S2: An example of a phylogeny with variable branch lengths and an adjusted P matrix accounting for those branch lengths. The basic algorithm is presented and an example is calculated for node III.

Figure S1:

Figure S2: