A Framework for Automated Extraction and
Classification of Linear Networks
G. Priestnall, M.J. Hatcher, R.D. Morton, S.J. Wallace, and R.G. Ley
Abstract
This paper presents a framework for extracting networks of linear features, such as roads, from imagery using an object-oriented geodata model. The proof-of-concept approach has resulted in the Automated Linear Feature Identification and Extraction (ALFIE) system, which uses a control strategy to automate the process flow. The resulting system is highly flexible, incorporating a toolkit of algorithms and imagery to extract linear features, and utilizes contextual information to allow evidence of class membership to be built up from a variety of sources. The classification algorithm employs a Bayesian modelling approach. This incorporates both geometric and photometric information, of which five key discriminators were identified: width, width variation, sinuosity, spectral value, and spectral value variation. This paper presents an in-depth discussion of the processes undertaken by the ALFIE system and quantitative results of the final output from the system in terms of classification accuracy and network completeness.
Introduction
Linear features such as road networks are fundamental components of many vector GIS databases and are increasingly required for a range of multi-dimensional modelling applications. Although such features are recognizable by humans in
the majority of medium to high resolution remotely sensed
imagery, the task of algorithmically discriminating between
roads and other linear features observable in imagery is complex and calls for an approach based upon objects rather than
pixels. The properties of objects and their placement within
the wider scene must be considered in order to attempt to utilize some of the contextual knowledge used by humans. This
paper presents an approach to managing the complexity of
this recognition problem which involved the development of
a flexible and extensible system set within an object-oriented
spatial database environment.
The Automatic Linear Feature Identification and Extraction (ALFIE) project was led by QinetiQ (formerly the British Government’s Defence Evaluation and Research Agency) and involved the School of Geography at the University of Nottingham and Laser-Scan Limited. It was funded by the UK Ministry of Defence Corporate Research Programme. The project was driven by the need to rapidly populate military Synthetic Natural Environment (SNE) databases for use within
simulations. Standard military datasets are typically used to
provide the bulk of the data for an SNE database. However,
G. Priestnall is with the School of Geography, University of
Nottingham, Nottingham, NG7 2RD, UK (gary.priestnall@
nottingham.ac.uk).
M.J. Hatcher, S.J. Wallace, and R.G. Ley are with the Space
Department, QinetiQ Ltd., Cody Technology Park, Ively Road,
Farnborough, Hampshire, GU14 0LX, UK.
R.D. Morton is with Laser-Scan Ltd., Cavendish House,
Cambridge Business Park, Cambridge, CB4 0WZ, UK.
such datasets may not be available for the specific area of interest, may be at an inappropriate scale, may require augmentation and filtering, and may be based on out-of-date mapping.
The requirement therefore exists to generate tailored, up-to-date geospatial data in a cost-effective manner. The geospatial
data and attribute requirements for such databases are varied
and include terrain models as well as vector datasets. Of the
latter, linear networks are important elements. It is in this area
(i.e., the extraction of linear features including, but not limited to, roads) that the initial ALFIE research has concentrated
to enable the devised framework to be proven. Although the
requirements for the research have been driven by military
SNE databases, the strategy presented has direct implications
for operational automated map production and revision
systems.
Commercial satellite imagery, which is now readily available for all areas of the world, provides an important source of
geospatial data for the production of SNE databases; however,
efficient operational strategies for extracting feature information from these images need to be developed. The ALFIE project
addresses this capability gap.
The ALFIE approach is based upon a number of fundamental concepts:
• The use of imagery of different spatial and spectral resolutions;
• The exploitation of pre-existing linear extraction algorithms;
• Development within an object-oriented framework;
• Improving the intelligence of the extraction through the use of spatial and aspatial context;
• Automating the complete extraction and classification process through a control strategy.
The ALFIE system has been designed to accommodate
knowledge of a range of source imagery and linear feature extraction algorithms. The availability of various types of imagery in different parts of the world will vary, and therefore,
there is a desire to automate the selection of the most appropriate image and algorithm combination. An important element of the approach is to allow the inclusion of new imagery
and algorithms into this knowledge base in the future. The
imagery of interest includes both panchromatic and multispectral optical datasets with spatial resolutions ranging from
30 m to 1 m.
In previous research (Priestnall and Wallace, 2000;
Wallace, et al., 2001) the importance of contextual information and the use of an object database were presented. In this
paper, the details of an entire framework for linear feature
extraction are presented, and the importance of a control strategy to automate the many component modules is emphasised.
Approaches to Linear Extraction
The Nature of the Problem
The major requirement for the ALFIE system is the rapid generation of linear networks from whatever source imagery is
available. As with the majority of applications utilizing such
spatial information, the most useful form for the end product,
in terms of geometry, is a structured network of vector centerlines. The ability to selectively display objects, change their
representation (e.g., symbology and color), and add attributes
is particularly important if the linear features are going to contribute towards visualization and modelling environments.
The problem at hand therefore is to extract discrete objects
that will ultimately have attribution, topological structure and
be part of an inter-related group of geographical objects. The
emphasis for Commercial Off The Shelf (COTS) software has
been pixel-based, with the object-based approaches being
largely within the research domain. For example, Hofmann &
Reinhardt (2000) discuss the prospects for land cover classification based around segmented objects rather than pixels.
The extraction of discrete objects from image data is essentially a machine vision or more precisely an image understanding problem common to many domains including industrial inspection and military target recognition (Sonka, et al.,
1999). The extraction of geographical objects from remotely
sensed imagery could be considered more challenging than
many other domains for reasons that include:
• A wide variety of image characteristics (including spatial and spectral resolution);
• With industrial applications the inspection is terrestrial and at close range and therefore the environment can be controlled. In comparison, use of airborne or satellite imagery brings the atmosphere into play and this can have significant effects on the representation of the features within the imagery;
• The characteristics of geographic objects and their inter-relationships vary enormously.
Many image understanding applications utilize the attributes of objects from an early stage in the extraction process to
establish evidence for classifying those objects. Image processing algorithms extract low level primitives of various kinds
which form the initial set of unclassified objects. For example,
although road features are usually linear when considering
the target vector dataset, the nature of these objects in the
image means that several approaches to initial extraction
arise, namely the extraction of the edges of the road, the region defining the road surface, or the centreline of that road
feature.
For each technique the optimisation of the extraction results is an active area of research, and the advent of finer resolution satellite imagery presents ever more complex challenges. For the extraction of edges, many combinations of
edge detection filters and pre-processing techniques can be
considered (Forghani, 2000). For such edges to be useful, techniques must be developed to recognise pairs of matching
edges or groupings of edges representing features such as junctions (Teoh and Sowmya, 2000). As image resolutions become finer, linear features can be considered as elongated areas and region-based analysis implemented, for example, by a self-organizing map (Doucette, et al., 2001), or by a semantic
model of the linear structure (Hinz, et al., 2001).
Degrees of Automation
The challenge of image understanding could be regarded as
one of varying complexity when viewed across a wide range
of disciplines. When the object or pattern has quite a predictable shape, size, and type, then reliable total automation
can be achieved. With geographical imagery however, the
characteristics of objects within the image scene can vary
enormously. The problems of developing transferable rules
for automated object extraction have been recognised for
many years (McKeown, et al., 1985). As a result of geographical objects being so variable, attempts to extract them in a totally automated fashion have been largely unsuccessful unless
restrictions are placed upon the source image type or the characteristics of the target object. Even within the field of data
capture from scanned paper maps, where objects follow certain rules of cartographic representation, user guidance has
been necessary to enable successful object extraction in a production environment.
Semi-automated approaches often involve the manual
identification and seeding of a certain type of object, such as
a road, the geometry of which is then extracted by a sequential
line following algorithm (Vosselman and de Knecht, 1995) or
the active curve fitting approach of snakes (Gruen and Li,
1997).
An alternative approach is to reduce the search space for
objects by using existing map data to guide the extraction
process (Bordes, et al., 1997). Such approaches must address
issues of cartographic generalisation and in particular the degree to which positional information can be relied upon
(Priestnall and Glover, 1998; Abramovich & Krupnik, 2000).
The Introduction of Context
Attempts to increase the level of automation may utilize some
of the contextual information which humans employ when interpreting an image. The placement of an object within the
wider scene and its inter-relationships with other objects at a
range of scales would constitute general contextual knowledge (Priestnall and Wallace, 2000). When putting these broad
concepts into practice, more specific mechanisms for representing contextual clues are described. Contextual regions
and local rule-based sketches (Baumgartner, et al., 1997) represent different levels of spatial context. Containment within
broad land use regions influences the type of object patterns
observed, and at the local level certain rules can describe
commonly observed inter-relationships between objects of
different types. Local relationships between roads and linear
groupings of buildings are presented by Stilla and Michaelsen
(1997). In addition to knowledge contained within a single
scene, collateral evidence from other imagery can be considered (Tonjes and Growe, 1998).
When classifying linear features such as roads, characteristics including the reflectance properties and the geometric
pattern of the extracted linear primitive are taken into account
(Vosselman and de Knecht, 1995). The importance of establishing a wider network of topologically connected objects,
as recognised by Steger, et al. (1997), could be regarded as introducing one aspect of contextual knowledge into recognition strategies. Wang and Trinder (2000) illustrate the use of
simple rules to eliminate non-road features that are isolated
from the wider network after hierarchical grouping of line segments. The concept of a wider, connected, functional network
can be used to build up and evaluate networks in an automated fashion where high quality extraction results and reference networks are available (Weidmann, 1999).
A Framework for Managing Contextual Information
In order to begin to exploit contextual knowledge and to provide a high degree of automation, a framework must be designed that is powerful, flexible, and extensible. The complex
nature of object properties and inter-relationships suggests
the use of an object-oriented geospatial database. The requirement for rapidly extracting linear networks utilizing a range of
both source imagery and extraction algorithms with a high
degree of automation has led to the development of a software
control strategy to manage the complexity of the whole
process. The following section presents the control strategy
developed within the ALFIE project.
Control Strategy
The aim of the ALFIE project was to establish a framework that
facilitates the automatic extraction, identification, and attribution of linear features from imagery available at the time.
The resulting system adopts an approach to automation that
minimizes human interaction up to a final editing stage. A sequence of operational modules referred to as the process flow
allows the basic principles of the ALFIE approach to be implemented. Decisions are required at each stage as to the appropriate actions to be taken, and to replace human input into
this decision making process a control strategy is required.
The core software component of the control strategy, termed
the control interface, utilizes lookup tables to initiate the most
appropriate action at each stage and controls information flow
between these stages, termed control modules. Figure 1 summarises the operations required to implement the automated
procedure and their organization into a process flow of control modules, described in detail in a section to follow.
The preparation stage involves automated selection of
available imagery, linear extraction algorithm, and appropriate
parameters. The combination of image and algorithm selected
determines the nature of pre-processing undertaken in the next
stage. Contextual information is utilized in several ways beginning with the influence of broad landscape regions upon the
choice of extraction technique and parameters. During the collateral extraction stage, supporting contextual evidence is gathered to refine later classification of linears and to limit the
search space for network construction. This includes the capability to utilize generalized network information provided by
the standard military vector product VMAP (NIMA, 1995) where
this is available; however, the system is not reliant on this
data. Centerlines of linear features in the image are extracted
and vectorized, remaining as unclassified primitives at this stage.
Certain contextual filtering based upon water and vegetation
masks removes some primitives, and local snapping attempts
to build longer linear objects. The classification stage involves
attributing each linear object with geometric and photometric
properties that are then used to discriminate between roads,
railways, and rivers using a cluster-weighted model. A structured network of linear features is important; therefore, operations to build a topological network are initiated. Validation
statistics and graphics enable the user to assess the degree of
success before a final user-guided editing stage.
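To make the control strategy more concrete, the sketch below shows, in minimal Python, how a control interface might drive a fixed process flow of control modules through a shared state. The module functions and the state dictionary are illustrative assumptions only, not the ALFIE implementation (which sits on the Laser-Scan object database).

    # Minimal sketch of a control strategy driving a fixed process flow.
    # Module names and the 'state' dictionary are assumptions for illustration.

    def preparation(state):
        # Choose an extraction algorithm from a (simplified) look-up rule.
        state["algorithm"] = "Linefinder" if state["image"]["resolution_m"] <= 2 else "MSEL"
        return state

    def pre_processing(state):
        state["image"]["enhanced"] = True      # e.g. edge enhancement or smoothing
        return state

    def collateral_extraction(state):
        state["masks"] = {"water": [], "vegetation": []}
        return state

    def linear_extraction(state):
        state["linears"] = []                  # unclassified vector primitives
        return state

    def classification(state):
        state["classified"] = []               # road / railway / river labels
        return state

    def network_build(state):
        state["network"] = {}                  # topologically structured network
        return state

    def validation(state):
        state["report"] = {"completeness": None}
        return state

    # The control interface: an ordered process flow of control modules.
    PROCESS_FLOW = [preparation, pre_processing, collateral_extraction,
                    linear_extraction, classification, network_build, validation]

    def run_control_strategy(image_metadata):
        state = {"image": image_metadata}
        for module in PROCESS_FLOW:            # each stage feeds the next
            state = module(state)
        return state

    if __name__ == "__main__":
        result = run_control_strategy({"resolution_m": 2, "bands": "pan"})
        print(result["algorithm"], result["report"])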
The entire procedure is based around the idea of having
vector objects, the properties of which provide discriminating
evidence that becomes attribution to aid the classification of
that object. Incorporation of contextual information both
locally and regionally and the operations to construct a more
globally consistent network are fundamental. In order to implement this a requirement exists for a powerful and flexible
spatial database development environment to allow potentially complex object schema to be designed and appropriate
functionality to be associated with different classes of object.
Object-Oriented Database Technology
One approach to incorporating contextual information is to
build it into the extraction algorithm. This may be appropriate
if a single algorithm is used, but it significantly reduces the
flexibility of a framework where a toolkit of algorithms is required to handle a variety of image types and resolutions. A
more flexible approach is to divorce the contextual rules from
the extraction process. Unclassified linear objects then require
a technique to store complex attribution for each object representing its characteristics and context within the wider scene.
Object-oriented geospatial databases offer a good platform on
which to develop such techniques.
The Object-Oriented (O-O) approach has prevailed for
some time in computer science and offers a logical and flexible way of modelling the real world (Worboys, 1994). Following this approach, real world entities can be abstracted and
held as objects of certain types, each type being characterised
by certain attributes and operations. When implemented in
code, each object type becomes a class with attributes represented by data types ranging in complexity from simple integers to spatial geometries. The operations encapsulated within
each object class are implemented as software methods.
Object-Orientation has been fundamental in the approach
taken to linear classification by the ALFIE project. Extracted
linear features are maintained as objects within the Laser-Scan O-O spatial database. By defining suitable methods it becomes possible to interrogate primitive linear objects for contextual information that can be used for their classification. Table 1 lists the methods defined to assist the classification of linear objects within the ALFIE system. These value methods dynamically extract attributes from both source image and extracted linear vector primitives. As this information is derived on the fly by the method, rather than being stored as a static attribute, the information can be guaranteed to be up-to-date, automatically honoring any changes made to the database.

Figure 1. Components of the ALFIE system: (a) Operations undertaken by the control modules; (b) The process flow of control modules guided by the control strategy.

TABLE 1. VALUE METHODS USED TO REPRESENT GEOMETRIC, PHOTOMETRIC AND ELEVATION PROPERTIES (AFTER WALLACE, ET AL., 2001)

Broad Type of Attribute    Attribute Derived by a Value Method
Geometric                  Length of linear
                           Straight-line length from one end of linear to other
                           Total curvature along the linear
                           Maximum angle of turn along the linear
                           Sinuosity index (length/straight-line length)
                           Orientation of the linear
Geometric/Photometric      Width of linear (similar spectral values alongside centre-line)
                           Variation in width along the linear
Photometric                Dominant spectral value along the linear
                           Mean spectral value along the linear
                           Variation in spectral value along the linear (standard deviation)
                           Gradient of spectral value along the linear
                           Number of significant discontinuities in spectral value along the linear
Elevation                  Dominant elevation along the linear
                           Mean elevation along the linear
                           Variation in elevation along the linear (standard deviation)
                           Gradient along the linear
                           Number of significant discontinuities in elevation along the linear
Each vector primitive extracted from the image is stored
as a separate object of class unclassified linear with the ability
to derive attributes using the value methods (shown in Table 1)
encapsulated within that object. Given this underlying data
model, the extraction of linears and population of this geospatial database is initiated by the control strategy which then attempts to place objects within class single-carriageway road,
dual-carriageway road, railway or river, and build up road
class objects into a complete network.
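As an illustration of value methods that derive attributes on the fly rather than storing them, the following Python sketch models an unclassified linear whose geometric properties from Table 1 (length, straight-line length, sinuosity, orientation) are computed on demand from its centreline vertices. The class and method names are hypothetical and are not the Laser-Scan database schema.

    import math

    # Hypothetical "unclassified linear" object whose attributes are derived
    # on the fly by value methods, in the spirit of Table 1.
    class UnclassifiedLinear:
        def __init__(self, vertices):
            self.vertices = vertices            # [(x, y), ...] centreline vertices

        def length(self):
            return sum(math.dist(a, b) for a, b in zip(self.vertices, self.vertices[1:]))

        def straight_line_length(self):
            return math.dist(self.vertices[0], self.vertices[-1])

        def sinuosity(self):
            # Sinuosity index = length / straight-line length (Table 1).
            return self.length() / max(self.straight_line_length(), 1e-9)

        def orientation(self):
            (x0, y0), (x1, y1) = self.vertices[0], self.vertices[-1]
            return math.degrees(math.atan2(y1 - y0, x1 - x0)) % 180.0

    linear = UnclassifiedLinear([(0, 0), (10, 2), (20, 0), (30, 3)])
    print(round(linear.sinuosity(), 3), round(linear.orientation(), 1))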
Process Flow Stages
One of the central tenets of the ALFIE approach is its modularity. The control modules form discrete stages of the process
flow, the order of execution of these modules and their interactions being governed by the control interface. These two
components form the control strategy, introduced in a previous section. The sections following describe in more detail
the processing undertaken by the control strategy.
Preparation
An integral part of ensuring a good quality output is the selection of imagery and relevant algorithms to perform the initial
linear extraction. A number of algorithms available within
QinetiQ and in the public domain have been investigated. Each of these has been thoroughly tested to determine its suitability for different image types and domains. The optimal parameters with which an algorithm is best applied have also been determined. The results from this testing form a set
of rules that are implemented in the form of look-up tables.
The look-up tables are structured around the following input
parameters:
• Imagery available;
• Linear feature types required;
• Level of confidence required;
• Timescale permitted for processing.
The choice of imagery, algorithm, and parameters is affected by context regions, derived from land cover classification of coarse-resolution imagery. Separate look-up tables have been created for the context region types supported by the ALFIE database schema, namely rural, urban, and forest.
Table 2 shows an example of a look-up table for a particular
context region showing a matrix of image type and linear
extraction algorithm. For a given context region and image
type available, the extraction algorithms which are available
and appropriate are highlighted, and of those, the optimal
choice is selected along with the appropriate set of parameters
(although these are not illustrated here).
TABLE 2. EXAMPLE LOOK-UP TABLE USED BY THE PREPARATION CONTROL MODULE

Image types (rows): coarse-resolution multispectral, coarse-resolution panchromatic, mid-resolution multispectral, mid-resolution panchromatic, high-resolution multispectral, high-resolution panchromatic. Algorithms (columns): Susan, Linefinder, MSEL. Cells are marked 'o' where the algorithm is appropriate for use with that imagery and '*' where the algorithm is optimal for that imagery.

Pre-Processing
A major assumption of the ALFIE project is that imagery made available for input has been suitably georeferenced and radiometrically corrected, and working under this assumption the
potential effects of poor georeferencing have not been rigorously analysed within the project. Despite this, the imagery in
its native form may not be optimised for the extraction algorithm selected in the previous control module. To improve the
results of the extraction, tailored pre-processing such as edge
enhancement to bring out linear edges, or smoothing and segmentation to reduce noise is appropriate in some cases. The
required parameters for the pre-processing look-up table have
been derived through sensitivity analysis, and their selection
is fully automated. The introduction of new imagery or algorithms to the system would require additional preparatory
analysis to update the pre-processing look-up tables.
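The look-up tables themselves can be pictured as a simple mapping keyed on context region and image type, returning an algorithm, pre-processing steps, and parameters, as in the hedged sketch below. The keys, algorithm names, and parameter values shown are assumptions for illustration only.

    # Illustrative sketch of the kind of look-up table the preparation and
    # pre-processing modules might consult.  All entries are assumptions.
    LOOKUP = {
        ("rural", "high_res_pan"):  {"algorithm": "Linefinder",
                                     "pre_processing": ["edge_enhancement"],
                                     "params": {"min_length_px": 50}},
        ("urban", "high_res_pan"):  {"algorithm": "Linefinder",
                                     "pre_processing": ["smoothing"],
                                     "params": {"min_length_px": 20}},
        ("rural", "mid_res_multi"): {"algorithm": "MSEL",
                                     "pre_processing": [],
                                     "params": {"min_length_px": 10}},
    }

    def select_extraction(context_region, image_type):
        """Return the algorithm, pre-processing and parameters for this case."""
        try:
            return LOOKUP[(context_region, image_type)]
        except KeyError:
            raise ValueError(f"No rule for {context_region!r} / {image_type!r}")

    print(select_extraction("rural", "high_res_pan")["algorithm"])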
Collateral Extraction
The majority of the feature information required comes from
the linear extraction process. However, the classification and
network building components require additional contextual
information. These data are derived either directly from the
provided imagery or from the standard military product VMAP
Level 1, if available for the area of interest.
The collateral extraction products that are derived are as
follows:
• Water mask;
• NDVI (and vegetation mask);
• Image texture;
• VMAP junction locations.
Water and vegetation masks are instrumental in refining the
classification of linears, as typically noise features in rural
areas may be misclassified as water features, while small-scale
vegetation features such as hedgerows may be classified as
roads. By applying water and vegetation masks, misclassified
linears can be identified and removed.
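A minimal sketch of how a vegetation mask might be derived from NDVI and then used to flag a misclassified linear is given below. The band arithmetic is standard, but the 0.3 NDVI threshold and the 80 percent containment rule are illustrative assumptions rather than ALFIE parameters.

    import numpy as np

    # Derive an NDVI-based vegetation mask from red and near-infrared bands.
    def vegetation_mask(red, nir, threshold=0.3):
        red = red.astype(float)
        nir = nir.astype(float)
        ndvi = (nir - red) / np.maximum(nir + red, 1e-6)   # avoid divide-by-zero
        return ndvi > threshold                            # True where vegetated

    # A linear whose pixels fall mostly inside the mask (e.g. a hedgerow
    # misclassified as a road) can then be flagged for removal.
    def mostly_vegetated(mask, pixel_coords, fraction=0.8):
        rows, cols = zip(*pixel_coords)
        return mask[np.array(rows), np.array(cols)].mean() >= fraction

    red = np.random.randint(0, 255, (100, 100))
    nir = np.random.randint(0, 255, (100, 100))
    mask = vegetation_mask(red, nir)
    print(mostly_vegetated(mask, [(10, 10), (10, 11), (10, 12)]))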
A limitation of the classification process is the difficulty
in discriminating between road and railway features. Both appear radiometrically similar in panchromatic and multispectral optical imagery. However, a measure of image texture at
high resolution can aid discrimination, as road tarmac is typically more homogeneous in appearance than the combination
of rail, sleeper, and gravel forming the railway line. The texture measure is used as a discriminant input into the classifier, as discussed in a section to follow.
In some areas of interest, the standard military vector
product VMAP Level 1 is available. If so, this can yield useful
information, despite the fact that its accuracy (equivalent to
1:250000 map scale) is insufficient for direct comparison with
the imagery, particularly at high-resolution. In the case of topologically structured datasets, the key nodes (e.g., junctions,
bridges, and intersections) are likely to be more spatially accurate than the links. Therefore, all the nodes in the dataset are
parsed, and those representing features of interest to the user
are stored for later use. The links associated with those nodes
are also stored, so the topological graph of the VMAP network
can be traversed. This graph is then used as a guide during the
network build process.
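The VMAP-derived graph can be pictured as a plain adjacency structure of junction nodes and links, as in the sketch below. The class and field names are assumptions and are not the VMAP Level 1 data model.

    from collections import defaultdict

    # Sketch of storing VMAP junction nodes and links as an adjacency list so
    # the topological graph can be traversed during the network build.
    class VmapGraph:
        def __init__(self):
            self.nodes = {}                       # node_id -> (x, y)
            self.adjacency = defaultdict(set)     # node_id -> {neighbour ids}

        def add_node(self, node_id, coord):
            self.nodes[node_id] = coord

        def add_link(self, a, b):
            self.adjacency[a].add(b)
            self.adjacency[b].add(a)

        def degree(self, node_id):
            # Number of arms (A) expected at this junction.
            return len(self.adjacency[node_id])

    graph = VmapGraph()
    graph.add_node("j1", (450100.0, 255200.0))
    graph.add_node("j2", (450900.0, 255650.0))
    graph.add_link("j1", "j2")
    print(graph.degree("j1"))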
Linear Extraction
The automation of linear extraction generally takes one of two
basic approaches: identification then extraction, or extraction
then identification. The first of these two approaches is essentially an automated form of the manual digitising process
using line following or snake algorithms as previously discussed. Such approaches can be very sensitive to local image
properties, often failing to detect any linears at all in some parts
of an image, and are often tuned to a single feature type. The
ALFIE approach overcomes these limitations by employing the
second extraction method. In this case, every linear feature
that can be detected is extracted as an unclassified primitive,
an example using a Russian KVR image with a GSD of approximately 2 m being shown in Figure 2.

Figure 2. Typical output from the linear extraction module: (a) Original KVR image of Worcester, UK, and (b) Linears extracted from the area shown in (a).

While this exhaustive method is expensive in terms of CPU time and data volume,
this method ensures that all features of different classes are extracted, with the identification (or classification) of these features being performed later. This method is only made possible by using the OODB as a storage medium, and by using
context in the identification process.
The output illustrated is from the Linefinder algorithm, a
centreline detector using a Marr-Hildreth filter, which is optimised for use with high-resolution imagery. The algorithm is
particularly good at extracting broad features in less-urbanized
regions, but suffers from fragmentation, requiring adjacent
primitives to be grouped into longer, more complete features.
The fragmentation of features depicted in Figure 2 is typical
of a linear extraction from satellite imagery, whereby extraction quality is heavily affected by other objects in the scene.
This is a particular problem in the urban area where object
variability is high and a large number of occlusions are present
(due to building overhang, shadow, or on-road features, such
as vehicles).
The degree of connectivity within the dataset can be improved by grouping the individual primitives. In order to join
these disconnected primitives, a snapping procedure, summarized in Figure 3, is applied across all the extracted linear objects.

Figure 3. Snapping and cleaning process.

A radial neighbourhood at the endpoints of each linear object (source object) is examined. If one or more linear objects (target objects) are found within this search radius, a matching algorithm is applied that considers the spectral
properties of the original raster image and the orientation of
the target and source objects. Source objects are matched
with the target object whose spectral signature and orientation are most similar. If this similarity falls within a pre-defined tolerance, the two objects are merged into one. If no adjacent features are found within the search radius, then the
feature is deemed a candidate for removal. The removal of
very small, isolated primitives reduces the level of noise in
the dataset.
The grouping procedure is carefully controlled, with sensible search radii and strict similarity tolerances used, in
order to prevent unreasonable snapping from occurring. Unconstrained snapping may allow primitives representing two
completely different features, or separated by too great a distance, to be joined. Grouping is performed iteratively to allow
features to be built up progressively, until a pass yields no further changes to the dataset.
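A stripped-down version of this iterative grouping step is sketched below: endpoints within a search radius are merged when their orientations agree, and passes repeat until nothing changes. The radius and angle tolerance are illustrative, and the spectral-similarity test used by ALFIE is omitted.

    import math

    def orientation(seg):
        (x0, y0), (x1, y1) = seg[0], seg[-1]
        return math.degrees(math.atan2(y1 - y0, x1 - x0)) % 180.0

    def snap_once(linears, radius=5.0, max_angle_diff=20.0):
        """One grouping pass; returns the (possibly merged) list of polylines."""
        for i, src in enumerate(linears):
            for j, tgt in enumerate(linears):
                if i == j:
                    continue
                gap = math.dist(src[-1], tgt[0])
                angle = abs(orientation(src) - orientation(tgt))
                angle = min(angle, 180.0 - angle)
                if gap <= radius and angle <= max_angle_diff:
                    merged = src + tgt            # join source end to target start
                    rest = [l for k, l in enumerate(linears) if k not in (i, j)]
                    return rest + [merged]
        return linears                            # no further changes: pass is stable

    linears = [[(0, 0), (10, 0)], [(12, 1), (25, 2)]]
    while True:
        merged = snap_once(linears)
        if merged == linears:
            break
        linears = merged
    print(len(linears))   # the two fragments have been grouped into one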
Classification
The output from the linear extraction stage is an entirely unclassified set of linear primitives. The next stage is to derive a
classification into one of the main functional groups (single-carriageway road, dual-carriageway road, river, or railway)
and to build up the primitives into a topologically structured
network.
The database schema provides information on every linear object through database methods. These methods are invoked by the classification module to build up a knowledge
base comprising the information about each linear object as
described in Table 1. Grouped linear objects are more straightforward to classify than heavily fragmented primitives, as
metrics such as sinuosity are unrepresentative of the feature
over very short distances. Engineering rules would suggest
that the elevation of the underlying terrain should be a highly
significant discriminator; for example, there are limitations in
the acceptable gradient of a railway line, and rivers must have
a consistent decline in elevation in order for the water to flow.
However, the elevation information derived from the standard
military product DTED, even at Level 2 (30 m post spacing), is
not of sufficient resolution to provide the level of detail required to quantify the terrain.
A number of techniques have been investigated as possible means for classifying the linears based on their attributes,
from a straightforward multi-criteria sieving methodology
using predefined engineering rules, through to a complex belief network. The chosen classification method should be
fuzzy and provide some measure of probability for the accuracy of the outcome. For the classification of the linear features in ALFIE, a Cluster-Weighted Model (CWM) has been developed. The CWM is a Bayesian probabilistic model with a
fixed architecture, but with flexibility in model parameters
(Gershenfeld, et al., 1999). This combines the flexibility of a
Gaussian mixture model with the benefits of using a general
linear model to yield real or discrete valued outputs. The output from the CWM is a straightforward probability table, which
has as many columns as there are discrete valued dimensions
(see Equation 1). These discrete dimensions correspond to the
database methods determined to be significant discriminators.
Those attributes found to be most effective in discrimination
were sinuosity, width, variation in width, dominant spectral
value, and the variation in spectral value.
$$
p(\mathbf{y} \mid \mathbf{x}, k) = p(\mathbf{y} \mid k) =
\begin{bmatrix}
p(y_1 = 1 \mid k) & p(y_2 = 1 \mid k) & \cdots & p(y_M = 1 \mid k) \\
p(y_1 = 2 \mid k) & p(y_2 = 2 \mid k) & \cdots & p(y_M = 2 \mid k) \\
\vdots & \vdots & \ddots & \vdots \\
p(y_1 = L \mid k) & p(y_2 = L \mid k) & \cdots & p(y_M = L \mid k)
\end{bmatrix}
\tag{1}
$$
where x is the set of input features, y is the set of output features, k is the set of clusters, L is the number of different
classes and M is the number of discrete output dimensions.
Every column in the probability table sums to unity.
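The way such a probability table can be used at classification time is sketched below: cluster responsibilities for a linear's feature vector are combined with per-cluster class probabilities to give a class distribution. The cluster parameters and tables are invented for illustration; this is a simplification of cluster-weighted classification, not the trained ALFIE model.

    import numpy as np

    CLASSES = ["dual_carriageway", "single_carriageway", "railway", "river"]

    # Two made-up clusters in a 5-D feature space (sinuosity, width, width
    # variation, dominant spectral value, spectral value variation), with
    # diagonal covariances and per-cluster class probabilities.
    means = np.array([[1.05, 8.0, 1.0, 120.0, 10.0],
                      [1.30, 15.0, 4.0, 60.0, 25.0]])
    variances = np.array([[0.01, 4.0, 0.5, 400.0, 25.0],
                          [0.05, 16.0, 2.0, 400.0, 100.0]])
    weights = np.array([0.5, 0.5])
    class_given_cluster = np.array([[0.4, 0.5, 0.1, 0.0],    # cluster 0
                                    [0.0, 0.1, 0.2, 0.7]])   # cluster 1

    def classify(x):
        x = np.asarray(x, dtype=float)
        # Gaussian density of x under each cluster (diagonal covariance).
        log_dens = -0.5 * np.sum((x - means) ** 2 / variances + np.log(variances), axis=1)
        resp = weights * np.exp(log_dens - log_dens.max())
        resp /= resp.sum()                                   # p(k | x)
        class_probs = resp @ class_given_cluster             # sum_k p(k|x) p(class|k)
        return dict(zip(CLASSES, np.round(class_probs, 3)))

    print(classify([1.08, 9.0, 1.2, 115.0, 12.0]))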
The CWM is trained using a manually created truth dataset
representing a typical set of features where class membership is known. This is used first to determine the number of
clusters to use for optimum output (the model order), and
second to locate the position of those clusters within the
multi-dimensional object space. A bootstrap test is used to
determine the number of clusters, and then the truth data are
compared with the output estimates. By adjusting the number
of dimensions within the model, and the properties that these
dimensions represent, a “best” performing model was created.
As the properties of linear features are, to a certain degree, domain specific, a model was selected that performed most effectively for all context regions within the area of interest. A
useful extension to the model would be the development of
different classifier models for different context regions.
Network Building
Following the initial classification process is an iterative network build and classification refinement. This stage takes the
thousands of linear primitives and the junction locations
taken from VMAP and uses pattern-matching techniques to determine the important nodes in the extracted dataset. Network
building follows this hierarchical approach by attempting to
construct the major network structure first.
The junction coordinates from the VMAP data are used to
seed a junction building process, starting with the most significant (e.g., motorway intersections). The process begins with a
radial search in the extracted dataset about each junction for
all linear objects that intersect the search radius. Pattern
matching algorithms are then applied to establish which of
these linears most likely represent the arms of the junction.
In order to prepare for the pattern matching, the collection of extracted linears is filtered. Linears below a threshold length are removed. Linears that intersect the search radius more than twice are removed. Linears whose start and end points are contained within the radius are removed. This leaves just two types of linear within the collection: (a) those that start or terminate within the search radius (end-point linears), and (b) those that bisect the search radius (bisecting linears), as illustrated in Figure 4. By filtering the linears, the number of potential junction entrants is reduced to a manageable level. Linears representing the features entering a junction are most likely to be of a reasonable length and begin within the radius or pass right through.

Figure 4. Detecting junction entrants using VMAP: (a) linears that start or terminate within the search radius; (b) linears that bisect the search radius.
The first step in the pattern matching process is to determine the positions of potential junctions. Given that the true
number of junction arms A is known from the VMAP data, all
combinations of extracted linears that can represent a junction
with A arms need to be established. For this purpose a score
is assigned to each of the linears remaining after the filter
process: 2 for bisecting linears (since these linears will represent two arms of a junction, see Figure 4) and 1 for end-point linears. For example, a junction with four arms (A = 4) can be comprised of two bisecting linears, or one bisecting linear and two end-point linears, or four end-point linears. Properties of the linears such as their estimated class, orientation, and pattern are analysed, and the linears most consistent with the VMAP junction are built together into a junction object.
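The arm-counting step can be sketched as a small combinatorial search, with bisecting linears scoring 2 and end-point linears scoring 1; only combinations whose scores sum to the VMAP arm count A are retained as candidate configurations. The further scoring by class, orientation, and pattern is omitted in this illustrative sketch.

    from itertools import combinations

    def candidate_configurations(linears, arms):
        """linears: list of (id, kind) with kind 'bisecting' or 'end-point'."""
        score = {"bisecting": 2, "end-point": 1}
        candidates = []
        for r in range(1, len(linears) + 1):
            for combo in combinations(linears, r):
                if sum(score[kind] for _, kind in combo) == arms:
                    candidates.append([lid for lid, _ in combo])
        return candidates

    linears = [("L1", "bisecting"), ("L2", "bisecting"),
               ("L3", "end-point"), ("L4", "end-point")]
    for combo in candidate_configurations(linears, arms=4):
        print(combo)
    # Prints ['L1', 'L2'], ['L1', 'L3', 'L4'], ['L2', 'L3', 'L4']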
After determining the locations of critical junctions and
intersection nodes within the extracted linears, the interconnectivity between them is structured using the iterative approach shown in Figure 5. A regional approach is used,
whereby every junction is visited in turn, and its connectivity
to every other junction in its vicinity is assessed. The junction
entrants are examined pairwise, and entrants of the same class
are identified. A least-cost routing method is then used to fit a
corridor between them. Cost is calculated from the radiometric
values of the pixels between the junction points, with pixels of similar radiometric values to the feature of interest being given low cost. Linears already extracted are burnt into the image as areas of very low cost as they are most likely to represent the desired route between the two points (see Figure 6). If the cost of the created corridor is above a predefined threshold, or the corridor exhibits unexpected properties, such as a sudden change in radiometric value, then it will be discarded.

Figure 5. Iterative network building process.

Figure 6. Corridor building between two junctions.
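The corridor construction can be illustrated with a least-cost (Dijkstra-style) search over a small cost raster in which cells radiometrically similar to the target feature, and cells covered by already-extracted linears, carry low cost. The raster values below are assumptions for illustration, not the ALFIE cost model.

    import heapq

    def least_cost_path(cost, start, goal):
        rows, cols = len(cost), len(cost[0])
        best = {start: 0.0}
        queue = [(0.0, start, [start])]
        while queue:
            total, cell, path = heapq.heappop(queue)
            if cell == goal:
                return total, path
            if total > best.get(cell, float("inf")):
                continue
            r, c = cell
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    new_total = total + cost[nr][nc]
                    if new_total < best.get((nr, nc), float("inf")):
                        best[(nr, nc)] = new_total
                        heapq.heappush(queue, (new_total, (nr, nc), path + [(nr, nc)]))
        return float("inf"), []

    cost = [[1, 9, 9, 9],
            [1, 1, 9, 9],     # the low-cost cells mimic a burnt-in extracted linear
            [9, 1, 1, 9],
            [9, 9, 1, 1]]
    total, path = least_cost_path(cost, (0, 0), (3, 3))
    print(total, path)
    # A corridor whose total cost exceeded a threshold would be discarded.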
This iterative process continues by detecting additional
junctions. This includes those present in the VMAP dataset
that have not been successfully detected in the extracted
dataset, and further junctions not present in the VMAP dataset.
Corridors are then built out from these junctions in order to
complete the network further. Junctions not present in the
VMAP dataset are identified by using pattern-matching techniques similar to those employed with a known VMAP junction location. In this case, endpoints of significant linears are
used as the seed locations, around which a radial search is
performed. Any linears falling within the search radius are assessed for their classification, similarity, pattern, and orientation. If sufficient high-scoring linears are found to converge
upon a point, then a potential junction is flagged at this location. This potential junction is used in the corridor building
process, although its thresholds and constraints are higher
than those for junctions detected using VMAP.
The pattern matching technique used is quite basic and
sensitive to the quality of the local primitives; however, it is effective at detecting the location of distinct converging linears.
A number of authors have described more complex methods,
such as the Connective Hough Transform (Yuen, et al., 1993)
or probabilistic relaxation approach (Matas and Kittler, 1993),
which could be employed at this stage. Alternatively, Lindeberg (1998) describes a multi-scale approach to junction detection directly from raster images, which could be used at the
collateral extraction stage to provide an additional junctions
dataset.
Validation
The final stage of processing provides a means for the end
user to determine the success of the process flow and the
completeness of the output. The validation component of
the ALFIE system comprises a set of database methods allowing quantitative and qualitative assessment of the results.
The validation methods available are summarised in Table 3.
Display methods show graphically the areas of the network
that are incomplete or incorrect, while process methods
parse the complete dataset and provide a set of summary
statistics.
TABLE 3. VALIDATION METHODS

Validation Metric                            Type of Validation
Completeness of network                      Display and Statistic
Confidence of classification                 Display and Statistic
Correctness of classification*               Display and Statistic
Correctness of classification (by class)*    Statistic
Reclassification during network build        Display and Statistic
Final classification of network              Display

*truth data must be available in order to use these validation metrics.
Several methods could be regarded as offering an internal
validation and require no reference truth dataset. One such
example relates to the completeness of the network with
respect to the generalized VMAP network used in junction
seeding. These statistics and graphics effectively present the
degree to which the connectivity of the VMAP graph is replicated in the extracted graph. Although VMAP is used for seeding junctions, the network building will not always be able to
complete network elements between any two seeded junctions, possibly due to the failure to detect one of these junctions in the extracted graph.
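One way to picture this internal completeness measure is sketched below: each VMAP link is checked for whether both of its end junctions were detected and whether a corridor now connects them in the extracted network. The data structures are simplified assumptions.

    # Fraction of VMAP links whose connectivity is replicated in the
    # extracted graph (a hedged sketch, not the ALFIE validation method).
    def vmap_replication(vmap_links, detected_junctions, extracted_links):
        replicated = 0
        for a, b in vmap_links:
            if (a in detected_junctions and b in detected_junctions
                    and ((a, b) in extracted_links or (b, a) in extracted_links)):
                replicated += 1
        return replicated / len(vmap_links) if vmap_links else 0.0

    vmap_links = [("j1", "j2"), ("j2", "j3"), ("j3", "j4")]
    detected = {"j1", "j2", "j3"}
    extracted = {("j1", "j2")}
    print(f"{vmap_replication(vmap_links, detected, extracted):.0%}")   # 33%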
Other methods report on the classification of linears, beginning with a graphical display of the level of confidence of
each linear in its own classification, based around output
from the classifier. Any reclassification of linears during the
network building stage can be highlighted, and the final classified vector map displayed.
The object-oriented data model being employed allows
for more dynamic validation at the object level and offers an
area for continued research. The degree to which a line or
junction forms part of an overall network could be a valuable
validation procedure, and a possible mechanism for this
could be the use of agent methods. Objects being aware of
their place in a network has clear parallels with intelligent
map generalisation (Lamy, et al., 1999). A junction may be locally connected and formed by a legal configuration of entrant
linears, but in terms of its wider connectivity, it would be useful if agent methods (or other appropriate techniques) could
establish whether entrants at a particular junction connected
to other junctions. An index of connectivity could be calculated and displayed for each junction that conveys the degree
to which it is fully connected.
Two final methods calculate the correctness of the classification; this absolute assessment of accuracy requires a truth
dataset against which to compare the extraction. Therefore,
these methods are designed to test the system conceptually
during development, but would be inappropriate for use with
an operational system.
The effective graphical representation of validation information has proved important during system development in
addition to its value for an operational system. The flexibility
to graphically roll back to any stage of the processing offers a
tool for exploring the output from each control module.
Results and Discussion
Results
The project aim was to demonstrate a flexible and modular
framework for the automated extraction of geospatial information from imagery. A complete process flow has been presented with each stage of processing calibrated and automated
by the overarching control strategy. Contextual information is
used to provide knowledge for discriminating between the
different classes of linear feature. Table 4 presents a confusion
matrix for the initial classification of features for the test area
illustrated in Figure 7. Table 5 details the overall classification
success for both urban and rural areas before and after the network building procedure.
TABLE 4. CLASSIFICATION CONFUSION MATRIX

                                            Actual Feature
Extraction Result             Road-Dual      Road-Single
                              Carriageway    Carriageway    Railway    River
Road-Dual Carriageway             80%             0%           0%        0%
Road-Single Carriageway           15%            56%          31%        0%
Railway                            5%            31%          68%        3%
River                              0%            12%           1%       97%

The quality of the final output from the ALFIE system, in terms of network completeness and classification accuracy, is
heavily dependent upon the quality of the initial extraction.
The algorithms currently employed by ALFIE, which were considered to be the most appropriate available at the time, are
not able to produce perfect output, particularly in terms of
maintaining the connectivity of the linear network.
TABLE 5. OVERALL PERFORMANCE OF CLASSIFICATION AND NETWORK BUILD PROCESSES

Criterion and Context Region           Pre-Network Build    Post-Network Build
Classification Confidence    Rural           89%                  100%
                             Urban           89%                  100%
Classification Accuracy      Rural           78%                   99%
                             Urban           52%                   96%
VMap Junctions Detected      Rural           n/a                   53%
                             Urban           n/a                   27%
Linears Extracted            Rural           72%                   n/a
                             Urban           37%                   n/a
Network Extracted            Rural           18%                   70%
                             Urban           16%                   21%
Network Connectivity         Rural           n/a                   65%
                             Urban           n/a                   60%

Figure 7. (a) True transport network; (b) ALFIE extracted network.

Figure 7 shows some sample output from the complete ALFIE process flow for a test area in Worcestershire, England compared with a manually-produced truth dataset. Although the linear network produced by the ALFIE process is significantly less complete than both the truth dataset and the VMAP vector product, it is a potentially more precise, spatially accurate, and topologically correct representation of the detected real-world features in the area due to the relatively high spatial resolution of the source imagery. As can be seen, the major road intersection in the north-east of the region has been extracted particularly well. Additionally, the railway cutting through the image from north to south is detected well, although some confusion can be seen where multiple tracks run parallel to each other. The river feature in the center is extracted well in rural areas but is lost as it enters the
urban area. Two examples of erroneous extractions are highlighted in Figure 7b. The feature running across the center of
the area illustrates how a series of minor road segments have
been linked together to form an extracted single carriageway
road. The large river towards the left of the area is not extracted due to its width exceeding the threshold of the extraction algorithm; however, the river is extracted as part of the
water mask as previously described. Extraction of the network
within the urban area is less effective than within the rural
area. Although the extraction algorithm detects a significant
percentage of the linear features, their appearance is too fragmented for a successful network build. Furthermore, the
extraction of building line edges and other urban clutter reduces the discriminatory capability of the classifier.
In areas of clean extraction, the classification and network
building processes work well. The algorithms currently available within ALFIE are less effective in extracting linear networks within the urban area. However, the ALFIE approach is designed to be future-proof, so that new developments in linear extraction algorithms can be added when they become
available without affecting the overall structure of the system.
Implications of the Research
The digitisation of geospatial data and the extraction of information from imagery are manually intensive, time-consuming
tasks. The drive to develop automated techniques to extract
this information spans all areas that can benefit from geospatial data. The impetus for the ALFIE project was the military
requirement for the rapid and cost-effective generation of
tailored geospatial data for mission planning and rehearsal.
However, the techniques employed are equally applicable to
any application, military or civilian, requiring geospatial data
and information extracted from image sources. A particular
application of interest is the updating of cartographic databases for national map revision.
The control strategy approach provides flexibility in operational requirements. This allows a priori information to be used to select processing techniques to optimise data capture, ensuring
output is tailored to the application requirements. Furthermore, its modularity provides a means of future-proofing,
whereby new technologies and algorithms can be added at a
later date.
On-Going Research
Further evaluation of the system against alternative methods,
including manual digitising, would be beneficial, adding
speed and ease of use to the validation metrics referred to in
Table 3. Many aspects of the research at a control module
level offer opportunities for on-going research. These include
the incorporation of paired outlines from edge extractors in
addition to the centerlines currently extracted by the ALFIE
system. The way in which contextual information is translated into inputs to the classifier, and the architecture of the
classifier itself, are subjects for future study. A follow-up project ALFIE-2 is extending the ALFIE system to 3D and areal feature extraction. This information can then be used to improve
the classification of features by providing additional contextual cues.
Extending the ALFIE approach to incorporate 3D features
brings a number of other benefits for SNE database generation:
• Significant value can be added to the data when compared to standard products such as VMAP. For example, 3D topology can be constructed at the time of extraction representing over/underpasses and ensuring consistency with the underlying terrain.
• Up-to-date terrain data can be extracted from the same imagery used to extract the linear and areal features with the terrain extraction tailored to specific requirements.
• The direct extraction of 3D information (as opposed to the use of existing Digital Elevation Model products such as DTED, which are limited by resolution) provides additional contextual clues, for example gradients, which can improve the classification of features. Groupings of buildings can be more reliably discriminated and therefore be included as local contextual clues within the classification process for linear features such as roads. The improved classifications can in turn refine the accuracy of the feature matching process required to extract the 3D information. There is therefore a positive feedback loop.
Conclusions
A framework has been presented which supports the rapid
generation of transport networks from imagery. A control
strategy manages a process flow of information with control
modules, utilizing the power of an underlying object-oriented
geospatial database. The architecture has been designed with
future developments in mind, being based around a toolkit of
available algorithms and imagery. The system is flexible and
extensible through the modular addition of any new algorithms
or techniques, ensuring that the overall framework of the
process flow remains the same.
The modular ALFIE system has proved a powerful development environment, allowing the recognition process to be
followed through and sensitivities within and dependencies
between modules studied.
The control strategy has helped to manage the complexity
of the problem and has allowed contextual information to be
incorporated in various ways throughout the process flow.
The adoption of an object-oriented geospatial database has allowed complex discriminating characteristics of objects to be
dynamically extracted, effectively enabling objects to classify
themselves.
SNE databases require much more than 2D linear networks
and so the research is currently being extended to investigate
the extraction of areal and 3D features. This extension to incorporate 3D information also offers powerful additional contextual clues, which can be used to improve the classification
of linear networks.
References
Abramovich, T., and A. Krupnik, 2000. A quantitative measure for the
similarity between features extracted from aerial images and road
objects in GIS, International Archives of Photogrammetry and Remote Sensing, Vol. XXXIII, Part B3, Amsterdam 2000, unpaginated CD-ROM.
Baumgartner, A., W. Eckstein, H. Meyer, C. Heipke, and H. Ebner, 1997.
Context-Supported Road Extraction, Automatic Extraction of Man-Made Objects from Aerial and Space Images (II) (A. Gruen, E.P. Baltsavias, and O. Henricsson, editors), Birkhauser Verlag,
Basel, pp. 299–308.
Bordes, G., G. Giraudon, and O. Jamet, 1997. Road Modelling based on
a cartographic database for aerial image interpretation, Semantic
Modelling for acquisition of topographic information from images
and maps (W. Förstner and L. Plümer, editors), Birkhauser Verlag,
Basel, pp. 123–139.
Doucette, P., P. Agouris, A. Stefanidis, and M. Musavi, 2001. Self-organised clustering for road extraction in classified imagery,
ISPRS Journal of Photogrammetry and Remote Sensing, 55:
347–358.
Forghani, A., 2000. Semi-automatic detection and enhancement of
linear features to update GIS files, International Archives of Photogrammetry and Remote Sensing, Vol. XXXIII, Part B3, Amsterdam 2000, unpaginated CD-ROM.
Gershenfeld, N., B. Schoner, and E. Metois, 1999. Cluster weighted
modeling for time series analysis, Nature, 397.
Gruen, A., and H. Li, 1995. Road extraction from aerial and satellite
images by dynamic programming, ISPRS Journal of Photogrammetry and Remote Sensing, 50(4):11–20.
Hinz, S., A. Baumgartner, H. Mayer, C. Wiedemann, and H. Ebner,
2001. Road extraction focussing on urban areas, Automatic Extraction of Man-Made Objects from Aerial and Space Images (III)
(E. Baltsavias, A. Gruen, and L. Van Gool, editors), Balkema,
Lisse, pp. 255–266.
Hofmann, P., and W. Reinhardt, 2000. The extraction of GIS features
from high resolution imagery using advanced methods based on
additional contextual information—first experiences, International Archives of Photogrammetry and Remote Sensing, Vol.
XXXIII, Part B4, Amsterdam 2000, unpaginated CD-ROM.
Lamy, S., A. Ruas, Y. Demazeau, M. Jackson, W.A. Mackaness, and R.
Weibel, 1999. The Application of Agents in Automated Map Generalisation, Proceedings of the 19th ICA/ACI Conference, Ottawa,
Canada, pp. 160–169.
Lindeberg, T., 1998. Feature detection with automatic scale selection,
International Journal of Computer Vision, 30(2):79–116.
Matas, J., and J. Kittler, 1993. Junction detection using probabilistic
relaxation, Image and Vision Computing, 11(4).
McKeown, D., W. Harvey, and J. McDermott, 1985. Rule-based interpretation of aerial images, IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-7(5):570–585.
NIMA, 1995. Vector Map (VMap) Level 1 Specification, URL: http://earth-info.nga.mil/publications/specs/printed/89033/VMAP_89033.pdf (last date accessed: 23 September 2004).
Priestnall, G., and R. Glover, 1998. A Control Strategy for automated
land use change detection: An integration of vector-based GIS,
remote sensing and pattern recognition, Innovations in GIS 5
(S. Carver, editor), Taylor and Francis, London, pp. 162–175.
Priestnall, G., and S. Wallace, 2000. Semi-automated linear feature extraction using a knowledge rich object data model, International
Archives of Photogrammetry and Remote Sensing, Vol. XXXIII,
Part B3, Amsterdam, 2000, unpaginated CD-ROM.
Sonka, M., V. Hlavac, and R. Boyle, 1999. Image Processing, Analysis and Machine Vision, 2nd Edition, Brooks/Cole Publishing,
828 p.
Steger, C., H. Mayer, and B. Radig, 1997. The role of grouping for
road extraction, Automatic Extraction of Man-Made Objects
from Aerial and Space Images (II) (A. Gruen, E.P. Baltsavias,
and O. Henricsson, editors), Birkhauser Verlag, Basel, pp. 245–
256.
Stilla, U., and E. Michaelsen, 1997. Semantic modelling of man-made
objects by production nets, Automatic Extraction of Man-Made Objects from Aerial and Space Images (II) (A. Gruen, E.P. Baltsavias,
and O. Henricsson, editors), Birkhauser Verlag, Basel, pp. 43–52.
Teoh, C.Y., and A. Sowmya, 2000. Junction extraction from high resolution images by composite learning, International Archives of
Photogrammetry and Remote Sensing, Vol. XXXIII, Part B3,
Amsterdam 2000, unpaginated CD-ROM.
Tonjes, R., and S. Growe, 1998. Knowledge Based Road Extraction
from Multisensor Imagery, Proceedings of the ISPRS Symposium
on Object Recognition and Scene Classification from Multispectral and Multisensor Pixels, Commission III, Working Group 4,
06–10 July 1998, Columbus, Ohio, USA.
Vosselman, G., and J. de Knecht, 1995. Road tracing by profile matching and Kalman filtering, Automatic Extraction of Man-Made objects from Aerial and Space Images (A. Gruen, O. Kuebler, and P.
Agouris, editors), Birkhauser Verlag, pp. 265–274.
Wallace, S.J., M.J. Hatcher, R.G. Ley, G. Priestnall, and R.D. Morton,
2001. Automatic differentiation of linear features extracted from
remotely sensed imagery, Österreichische Zeitschrift für Vermessung und Geoinformation, Heft 34:17–29.
Wang, Y., and J. Trinder, 2000. Road network extraction by hierarchical grouping, International Archives of Photogrammetry and Remote Sensing, Vol. XXXIII, Part B3, Amsterdam 2000, unpaginated CD-ROM.
Wiedemann, C., 1999. Completion of automatically extracted road networks based on the function of roads, Automatic Extraction of GIS
Objects from Digital Imagery (H. Ebner, W. Eckstein, C. Heipke,
and H. Mayer, editors), International Archives of Photogrammetry
and Remote Sensing (32) 3-2W5.
Worboys, M.F., 1994. Object-oriented approaches to geo-referenced
information, International Journal of Geographical Information
Systems, 8(4):385–399.
Yuen, S.Y.K., T.S.L. Lam, and N.K.D. Leung, 1993. Connective
Hough Transform, Image and Vision Computing, 11(5):295–301.