1.204 Final Project Network Analysis of Urban Street Patterns

Michael T. Chang
December 21, 2012
1 Introduction
In this project, I analyze road networks as planar graphs and use tools from network analysis to
measure their structural properties. Network analysis provides a way to quantify properties of
road networks, which allows us to compare cities and regions with different road layouts. This was
the approach taken in [Cardillo et al., 2006].
In [Cardillo et al., 2006], road layouts across the world were divided into six categories. For
example, New York City has a “grid-iron” layout, consisting of a planned grid of perpendicular
streets. London has a “medieval” layout, consisting of irregularly oriented streets that result from
unplanned growth; this layout is usually found in older cities. Irvine, California has a “lollipop”
layout, consisting of many cul-de-sacs and dead ends, which is often found in suburban areas.
While it is easy to see that these layouts differ from one another, we will use network properties,
like cost and efficiency, to quantify the difference.
2 Methods and data
2.1 Representing roads as a network
There are two ways that a road network can be represented as a mathematical graph. The first way
is a “primal representation,” where intersections are nodes, streets are edges, and street lengths
are weights, resulting in an undirected, weighted network. This representation is intuitive, because
when visualized it resembles a map, as in Figure 1. It also preserves the road network’s spatial
information: the locations of intersections and the lengths of streets.
The alternative is a “dual representation,” where nodes are streets and edges are intersections.
This representation discards the spatial information, since long streets with many intersections are
collapsed into single nodes with many edges. This representation focuses on the connectivity of the
streets and is more useful for studying certain properties, like betweenness centrality, as in
[Crucitti et al., 2006]. An algorithm for creating a dual representation is described in
[Masucci et al., 2009].
Choosing between a primal and dual representation depends on what properties of the road
network are to be studied. For this project, I wanted to focus on the spatial and geographical
properties, so the primal representation of a network was the better choice.
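As a minimal sketch of the primal representation, the snippet below builds an undirected, weighted adjacency structure from node coordinates and an edge list. The input format, the function name `build_primal_graph`, and the use of the straight-line distance between endpoints as the street length are assumptions made for illustration, not details taken from the project's code.

```python
import math

def build_primal_graph(coords, edges):
    """Return {node: {neighbor: length}} for an undirected, weighted street graph.

    coords: {node_id: (x, y)} intersection positions (hypothetical input format)
    edges:  iterable of (u, v) pairs, one per street segment
    """
    graph = {node: {} for node in coords}
    for u, v in edges:
        # Assumption: a street's length is the straight-line distance
        # between its two endpoint intersections.
        length = math.dist(coords[u], coords[v])
        graph[u][v] = length
        graph[v][u] = length  # undirected: store both directions
    return graph

# Toy example: four intersections at the corners of a unit square.
coords = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.0, 1.0)}
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
graph = build_primal_graph(coords, edges)
```

Keeping the coordinates alongside the adjacency structure is what lets the primal representation support length-based measures such as cost and efficiency.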
2.2 Data
The two cities chosen for study in my project are San Francisco, CA, USA, and Oldenburg, Germany.
San Francisco, being relatively modern, has large areas with a grid-iron layout, whereas
Oldenburg is a much older city, with a largely medieval road layout. The primal representations
of the road networks of these cities are shown in Figures 1 and 2.
Figure 1: Primal representation of San Francisco’s street network, containing 9322 nodes and 14809
edges.
The data used for this project was obtained online. It was provided by the Computer Science
department at Florida State University¹, as text files with the spatial locations of nodes and the
connectivity of edges. The same source also has data on a number of other road networks, including
other cities and highways in North America. There are some flaws in the data (e.g., intersections
that don’t actually exist), but these errors are small and should have a negligible impact on the
network properties we will measure.
2.3 Network properties of interest
In measuring network properties of road networks, I follow the same approach as in [Cardillo et al.,
2006]. There are three structural properties that will be analyzed: the meshedness coefficient, the
cost, and the efficiency.
The meshedness coefficient $M$ measures the degree of clustering. It is defined as $M = F/F_{max}$,
where $F$ is the number of faces in the graph, and $F_{max}$ is the number of faces in the maximally
connected planar graph. The formulas for these quantities, as given in [Cardillo et al., 2006], in a
graph with $N$ nodes and $K$ edges are $F = K - N + 1$ and $F_{max} = 2N - 5$. The meshedness
coefficient is a replacement for the clustering coefficient that we are familiar with from class. The
clustering coefficient is unsuitable for planar graphs, because it only counts triangles, whereas
larger cycles (squares, etc.) are important features of road networks. An illustration of this
problem is that many different planar graphs all have a clustering coefficient of 0, such as a square
grid and a tree.
¹ http://www.cs.fsu.edu/~lifeifei/SpatialDataset.htm
Figure 2: Primal representation of Oldenburg’s street network, containing 6105 nodes and 7035
edges.
Figure 3: Demonstration of how network efficiency is calculated. The blue line represents the
Euclidean distance between two nodes in the network, and the red line traces the shortest path
between the points through the network. The efficiency for these two points is the ratio of the
length of the blue line to the length of the red line. Global network efficiency is the average of
this ratio over all pairs of nodes.
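As a quick check of the meshedness formula, the sketch below evaluates $M = (K - N + 1)/(2N - 5)$ using the node and edge counts quoted in the captions of Figures 1 and 2. The function name `meshedness` is chosen here for illustration only.

```python
def meshedness(num_nodes, num_edges):
    """Meshedness coefficient M = F / F_max for a connected planar graph."""
    faces = num_edges - num_nodes + 1   # F = K - N + 1
    max_faces = 2 * num_nodes - 5       # F_max = 2N - 5
    return faces / max_faces

# Node and edge counts from the captions of Figures 1 and 2.
m_san_francisco = meshedness(9322, 14809)
m_oldenburg = meshedness(6105, 7035)
print(round(m_san_francisco, 3), round(m_oldenburg, 3))  # prints: 0.294 0.076
```

Any tree has $K = N - 1$, so $F = 0$ and the function returns 0, which is the value expected for a minimally connected network.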
The cost $W$ is defined as the sum of the lengths of all edges in the network. It is related to the
real cost of the road network, since the cost of building and maintaining roads increases with the
amount of roads built. The cost is given by the formula:
$$W = \sum_{i,j} a_{ij} l_{ij}$$ (1)
where $a_{ij}$ is 1 if an edge joins nodes $i$ and $j$ (and 0 otherwise) and $l_{ij}$ is the length of
that edge.
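The cost can be sketched in a few lines, assuming that edges are given as pairs of node ids and that each street's length is the straight-line distance between its endpoints (both are assumptions of this example). Each undirected edge is counted once here, matching the verbal definition; a sum over ordered pairs $(i, j)$ would count every edge twice.

```python
import math

def network_cost(coords, edges):
    """Total length of all edges, counting each undirected edge once."""
    return sum(math.dist(coords[u], coords[v]) for u, v in edges)

# Toy example: the four sides of a unit square.
coords = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.0, 1.0)}
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
total_cost = network_cost(coords, edges)
print(total_cost)  # prints: 4.0
```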
The efficiency $E$ of a network represents the efficiency of flow within the network, namely how
easy it is to get from one node to another. The local measure of efficiency, for a pair of nodes $i$
and $j$, is the ratio of the Euclidean distance $d^{Eucl}_{ij}$ to the distance along the shortest
path through the network, $d_{ij}$. This is illustrated in Figure 3. The global efficiency of the
network is the average value of this ratio over every pair of nodes, and is given by the formula:
$$E = \frac{1}{N(N-1)} \sum_{i,j,\, i \neq j} \frac{d^{Eucl}_{ij}}{d_{ij}}$$ (2)
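One way to evaluate the global efficiency is to run a shortest-path search from every node and average the ratio of Euclidean distance to network distance. The sketch below uses Dijkstra's algorithm from the standard library's `heapq`; the choice of algorithm, the data layout, and the function names are assumptions for illustration, since the report does not say how the shortest paths were computed. Unreachable pairs are treated as contributing zero.

```python
import heapq
import math

def shortest_path_lengths(graph, source):
    """Dijkstra: network distance from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale queue entry, a shorter path was already found
        for v, length in graph[u].items():
            nd = d + length
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def global_efficiency(coords, graph):
    """Average of d_Eucl / d_network over all ordered pairs i != j."""
    n = len(coords)
    total = 0.0
    for i in coords:
        dist = shortest_path_lengths(graph, i)
        for j in coords:
            if j != i and j in dist:  # unreachable pairs contribute 0
                total += math.dist(coords[i], coords[j]) / dist[j]
    return total / (n * (n - 1))

# Toy example: an L-shaped street with three intersections.
coords = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0)}
graph = {0: {1: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {1: 1.0}}
efficiency = global_efficiency(coords, graph)
```

In the toy example the two adjacent pairs have ratio 1, while the corner-to-corner pair has ratio $\sqrt{2}/2$, so the average is $(2 + \sqrt{2}/2)/3 \approx 0.902$. Running one search per node is quadratic in the number of nodes at best, which is worth keeping in mind for networks with thousands of intersections.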
2.4 Theoretical cases as baselines for comparison
In other studies of networks, the random graph and the complete graph frequently serve as extreme
cases and are useful for comparison. However, these theoretical models are not appropriate for planar
graphs, because they generally contain edges that cross one another. Instead, [Cardillo et al., 2006]
proposes new theoretical models to represent the minimally-connected and maximally-connected cases.
Figure 4: Illustration of theoretical cases used for comparison. Top-left: Map of a part of Savannah,
GA. Top-right: network representation of roads. Bottom-left: minimum spanning tree of the network.
Bottom-right: greedy triangulation of the network. Image reproduced from [Cardillo et al., 2006].
The minimal case, with the fewest edges, is the minimum spanning tree (MST). This is a tree that
uses the minimum number of edges needed to connect all nodes in one component, chosen so as to
minimize the total length of edges. The maximal case is the greedy triangulation (GT). This is a
maximally-connected planar graph, created by drawing triangles between nodes wherever possible while
also trying to minimize the total length of edges. Examples of these cases are depicted in Figure 4.
The minimum spanning trees for the San Francisco and Oldenburg road networks used in this
project are shown in Figures 5 and 6.
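A minimum spanning tree of a street network can be computed in several standard ways. The sketch below uses Kruskal's algorithm with a union-find structure; this is one common choice and is not necessarily the method used to produce Figures 5 and 6. The function name and the `(length, u, v)` edge format are assumptions of this example.

```python
def minimum_spanning_tree(num_nodes, weighted_edges):
    """Kruskal's algorithm.

    weighted_edges: list of (length, u, v) with nodes numbered 0..num_nodes-1.
    Returns the list of edges kept in the tree.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Follow parent pointers to the component root, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for length, u, v in sorted(weighted_edges):  # shortest edges first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:       # edge joins two components, so no cycle
            parent[root_u] = root_v
            tree.append((length, u, v))
    return tree

# Toy example: a unit square with one diagonal.
edges = [(1.0, 0, 1), (1.0, 1, 2), (1.0, 2, 3), (1.0, 3, 0), (2 ** 0.5, 0, 2)]
tree = minimum_spanning_tree(4, edges)
mst_cost = sum(length for length, _, _ in tree)
```

For the toy square the tree keeps three of the four unit-length sides and drops the diagonal, giving a total cost of 3.0.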
Using these baseline cases, we define the relative cost $W_{rel}$ and the relative efficiency
$E_{rel}$:
$$W_{rel} = \frac{W - W^{MST}}{W^{GT} - W^{MST}}$$ (3)
$$E_{rel} = \frac{E - E^{MST}}{E^{GT} - E^{MST}}$$ (4)
These are useful because they allow comparison across cities, since the values are normalized and
account for differences in node layouts between cities. However, I didn’t have time to write a greedy
triangulation algorithm for the project, so only the MST is available for comparison. Consequently,
I will measure the cost of each network as a multiple of the cost of its MST, and report only
the absolute values of efficiency.
Figure 5: Minimum-spanning tree for San Francisco’s street network.
Figure 6: Minimum-spanning tree for Oldenburg’s street network.
Figure 7: Degree distribution for street networks in San Francisco and Oldenburg (horizontal axis:
degree; vertical axis: frequency, as a fraction). Notice that the average degree of Oldenburg’s
street network is lower, because its layout involves fewer 4-way intersections, which are usually
only found in a grid-like layout. Compared to other networks, this degree distribution is very
narrow, due to the planarity constraint.
3 Results
3.1 Degree distribution
Figure 7 shows the distribution of node degree for both road networks. The very narrow range of
degree is a result of the planarity constraint. The average degree is lower in Oldenburg than in San
Francisco because Oldenburg’s road layout uses 3-way intersections more than the 4-way ones found in
grids. This is characteristic of an older street layout that was created in an unplanned,
self-organized way. The results from [Cardillo et al., 2006] confirm this trend on a larger scale:
their findings indicate that, in general, $P(k = 3) > P(k = 4)$ for self-organized street layouts, as
in Cairo and Venice, but the reverse is true for planned cities, such as New York, San Francisco, and
Washington.
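The degree distribution $P(k)$ can be computed directly from an edge list, as in the sketch below. The function name and input format are assumptions for illustration; nodes that appear in no edge are not counted.

```python
from collections import Counter

def degree_distribution(edges):
    """Return {k: fraction of nodes with degree k} from an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree.values())   # how many nodes have each degree
    num_nodes = len(degree)
    return {k: counts[k] / num_nodes for k in sorted(counts)}

# Toy example: one 4-way intersection (node 0) with four dead-end streets.
edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
p = degree_distribution(edges)
print(p)  # prints: {1: 0.8, 4: 0.2}
```

Comparing `p.get(3, 0)` with `p.get(4, 0)` for a real network gives the $P(k = 3)$ versus $P(k = 4)$ test discussed above.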
Figures 8 and 9 are maps of San Francisco and Oldenburg with intersections colored by node
degree. The maps confirm what we would expect from looking at the street patterns. Intersections in
grid-like areas typically have degree 4, and intersections in areas with irregular street patterns
(like the hilly area in SF, in Figure 8) typically have degree 3. Intersections in Oldenburg have
lower degree overall. Interestingly, the nodes with higher degree appear to be scattered without an
obvious pattern in SF, but are more concentrated in the city center in Oldenburg. This may reveal a
tendency for roads to connect to existing intersections in self-organized growth, but not in
pre-planned layouts.
Figure 8: Map of node degree (color scale from 1 to 7) for San Francisco’s street network, with the
“Grid” and “Hills” regions labeled. Note the contrast in typical intersection degrees between the
grid-like and hilly regions.
Network                               Meshedness    Cost                Efficiency
                                      coefficient   (relative to MST)
San Francisco, CA                     0.294         2.29                0.761
  MST                                 0             1                   0.394
  GT                                  1             -                   -
Oldenburg, Germany                    0.076         1.369               0.840
  MST                                 0             1                   0.334
  GT                                  1             -                   -
San Francisco, CA (Cardillo et al.)   0.309         -                   0.792
Irvine, CA (Cardillo et al.)          0.014         -                   0.374
London, UK (Cardillo et al.)          0.249         -                   0.803
Table 1: Network properties of road networks. The last three rows are taken from [Cardillo et al.,
2006].
Figure 9: Map of node degree, for Oldenburg’s street network.
3.2 Structural properties
Table 1 compares the three properties of interest for various cities.
Oldenburg has a much lower meshedness coefficient than San Francisco or any other grid-like
city. This is expected, because grid-like cities have intersections that are more clustered, whereas
medieval layouts may have intersections near each other that don’t have a road between them.
Irvine, CA also has a low meshedness coefficient, due to the disconnectedness of its roads. A low
meshedness coefficient also reflects the relatively small number of streets compared to intersections
in these cities.
The road network in San Francisco costs 2.29 times as much as its MST, but Oldenburg’s costs
only about 1.37 times as much. The Oldenburg road network is more minimalist, containing fewer
redundant connections, but this may increase difficulty in navigation. This may be a result of
self-organized growth, because people may prefer to build streets only when absolutely necessary to
connect two points. This difference in cost could be anticipated by noticing that more streets are
lost in building the MST for San Francisco than for Oldenburg (compare Figures 1 and 5 with Figures 2
and 6).
San Francisco’s and Oldenburg’s road networks have similar efficiencies, and both are much more
efficient than either MST or Irvine. Interestingly, San Francisco’s planned road network gains no
clear efficiency advantage over Oldenburg’s self-organized network, despite having a higher cost.
Also, notice that Irvine has a similar efficiency to the MSTs, which is expected, since
the structure of the lollipop layout resembles an MST, consisting of a tree-like structure with many
dead ends. This result highlights the inefficiencies associated with such a road layout.
4 Conclusion
These results can be useful in trying to design a transportation-effective and cost-effective road
network. While in this project I only looked at the cost and efficiency of two cities, [Cardillo et al.,
2006] compared 20 cities, and the results are shown in Figure 10, which plots each city on a
graph of efficiency vs. cost. First, there seems to be an upper limit to efficiency. Second, we see
that efficiency increases with cost, up to a certain point. It seems that the grid-iron layouts cost
more than medieval layouts without achieving higher efficiency. Finally, the trend in this graph
may imply an intrinsic trade-off between efficiency and cost, independent of road layout. However, it
is important to note that these metrics don’t account for other important properties of road
networks, such as difficulty of navigation or the amount of information required to locate an address.
Further steps for this project would be to perform the analysis on larger-scale road networks
(e.g., highways) to see if similar trends can be observed. I hypothesize that the network properties
(meshedness coefficient, cost, and efficiency) would be different from those of local roads because
highways are designed with different rules and goals. Future work could also account for other
properties of road networks, such as betweenness centrality, traffic flows, or susceptibility to
congestion. One possible question would be, “How many roads can be removed before the network fails
catastrophically and leads to congestion?”
Figure 10: Efficiency vs. cost for various real-world road networks.
References
Alessio Cardillo, Salvatore Scellato, Vito Latora, and Sergio Porta. Structural properties of planar
graphs of urban street patterns. Physical Review E, 73(6):066107, June 2006. ISSN 1539-3755.
doi: 10.1103/PhysRevE.73.066107. URL http://link.aps.org/doi/10.1103/PhysRevE.73.066107.
Paolo Crucitti, Vito Latora, and Sergio Porta. Centrality measures in spatial networks of urban
streets. Physical Review E, 73(3):036125, March 2006. ISSN 1539-3755.
doi: 10.1103/PhysRevE.73.036125. URL http://link.aps.org/doi/10.1103/PhysRevE.73.036125.
A. P. Masucci, D. Smith, A. Crooks, and M. Batty. Random planar graphs and the London
street network. The European Physical Journal B, 71(2):259–271, August 2009. ISSN 1434-6028.
doi: 10.1140/epjb/e2009-00290-4. URL http://www.springerlink.com/index/10.1140/epjb/e2009-00290-4.