A Framework for Automated Extraction and Classification of Linear Networks

G. Priestnall, M.J. Hatcher, R.D. Morton, S.J. Wallace, and R.G. Ley

G. Priestnall is with the School of Geography, University of Nottingham, Nottingham, NG7 2RD, UK (gary.priestnall@nottingham.ac.uk). M.J. Hatcher, S.J. Wallace, and R.G. Ley are with the Space Department, QinetiQ Ltd., Cody Technology Park, Ively Road, Farnborough, Hampshire, GU14 0LX, UK. R.D. Morton is with Laser-Scan Ltd., Cavendish House, Cambridge Business Park, Cambridge, CB4 0WZ, UK.

Abstract
This paper presents a framework for extracting networks of linear features, such as roads, from imagery using an object-oriented geodata model. The proof-of-concept approach has resulted in the Automated Linear Feature Identification and Extraction (ALFIE) system, which uses a control strategy to automate the process flow. The resulting system is highly flexible, incorporating a toolkit of algorithms and imagery to extract linear features, and utilizes contextual information to allow evidence of class membership to be built up from a variety of sources. The classification algorithm employs a Bayesian modelling approach that incorporates both geometric and photometric information, of which five key discriminators were identified: width, width variation, sinuosity, spectral value, and spectral value variation. This paper presents an in-depth discussion of the processes undertaken by the ALFIE system and quantitative results of the final output from the system in terms of classification accuracy and network completeness.

Introduction
Linear features such as road networks are fundamental components of many vector GIS databases and are increasingly required for a range of multi-dimensional modelling applications. Although such features are recognizable by humans in the majority of medium to high resolution remotely sensed imagery, the task of algorithmically discriminating between roads and other linear features observable in imagery is complex and calls for an approach based upon objects rather than pixels. The properties of objects and their placement within the wider scene must be considered in order to utilize some of the contextual knowledge used by humans. This paper presents an approach to managing the complexity of this recognition problem through the development of a flexible and extensible system set within an object-oriented spatial database environment.

The Automatic Linear Feature Identification and Extraction (ALFIE) project was led by QinetiQ (formerly the British Government's Defence Evaluation and Research Agency) and involved the School of Geography at the University of Nottingham and Laser-Scan Limited. It was funded by the UK Ministry of Defence Corporate Research Programme. The project was driven by the need to rapidly populate military Synthetic Natural Environment (SNE) databases for use within simulations. Standard military datasets are typically used to provide the bulk of the data for an SNE database. However, such datasets may not be available for the specific area of interest, may be at an inappropriate scale, may require augmentation and filtering, and may be based on out-of-date mapping. The requirement therefore exists to generate tailored, up-to-date geospatial data in a cost-effective manner.
The geospatial data and attribute requirements for such databases are varied and include terrain models as well as vector datasets. Of the latter, linear networks are important elements. It is in this area (i.e., the extraction of linear features including, but not limited to, roads) that the initial ALFIE research has concentrated, in order to prove the devised framework. Although the requirements for the research have been driven by military SNE databases, the strategy presented has direct implications for operational automated map production and revision systems. Commercial satellite imagery, which is now readily available for all areas of the world, provides an important source of geospatial data for the production of SNE databases; however, efficient operational strategies for extracting feature information from these images need to be developed. The ALFIE project addresses this capability gap.

The ALFIE approach is based upon a number of fundamental concepts:

• The use of imagery of different spatial and spectral resolutions;
• The exploitation of pre-existing linear extraction algorithms;
• Development within an object-oriented framework;
• Improving the intelligence of the extraction through the use of spatial and aspatial context;
• Automating the complete extraction and classification process through a control strategy.

The ALFIE system has been designed to accommodate knowledge of a range of source imagery and linear feature extraction algorithms. The availability of different types of imagery varies across the world, and there is therefore a desire to automate the selection of the most appropriate image and algorithm combination. An important element of the approach is to allow the inclusion of new imagery and algorithms in this knowledge base in the future. The imagery of interest includes both panchromatic and multispectral optical datasets with spatial resolutions ranging from 30 m to 1 m.

In previous research (Priestnall and Wallace, 2000; Wallace, et al., 2001) the importance of contextual information and the use of an object database were presented. In this paper, the details of an entire framework for linear feature extraction are presented, and the importance of a control strategy to automate the many component modules is emphasised.

Approaches to Linear Extraction

The Nature of the Problem
The major requirement for the ALFIE system is the rapid generation of linear networks from whatever source imagery is available. As with the majority of applications utilizing such spatial information, the most useful form for the end product, in terms of geometry, is a structured network of vector centerlines. The ability to selectively display objects, change their representation (e.g., symbology and color), and add attributes is particularly important if the linear features are to contribute towards visualization and modelling environments. The problem at hand, therefore, is to extract discrete objects that will ultimately carry attribution, have topological structure, and form part of an inter-related group of geographical objects. The emphasis for Commercial Off The Shelf (COTS) software has been pixel-based, with object-based approaches remaining largely within the research domain.
For example, Hofmann and Reinhardt (2000) discuss the prospects for land cover classification based around segmented objects rather than pixels. The extraction of discrete objects from image data is essentially a machine vision, or more precisely an image understanding, problem common to many domains including industrial inspection and military target recognition (Sonka, et al., 1999). The extraction of geographical objects from remotely sensed imagery could be considered more challenging than many other domains for reasons that include:

• A wide variety of image characteristics (including spatial and spectral resolution);
• With industrial applications the inspection is terrestrial and at close range, so the environment can be controlled. In comparison, the use of airborne or satellite imagery brings the atmosphere into play, and this can have significant effects on the representation of features within the imagery;
• The characteristics of geographic objects and their inter-relationships vary enormously.

Many image understanding applications utilize the attributes of objects from an early stage in the extraction process to establish evidence for classifying those objects. Image processing algorithms extract low-level primitives of various kinds which form the initial set of unclassified objects. For example, although road features are usually linear when considering the target vector dataset, the nature of these objects in the image means that several approaches to initial extraction arise, namely the extraction of the edges of the road, the region defining the road surface, or the centreline of that road feature. For each technique the optimisation of the extraction results is an active area of research, and the advent of finer resolution satellite imagery presents ever more complex challenges. For the extraction of edges, many combinations of edge detection filters and pre-processing techniques can be considered (Forghani, 2000). For such edges to be useful, techniques must be developed to recognise pairs of matching edges or groupings of edges representing features such as junctions (Teoh and Sowmya, 2000). As image resolutions become finer, linear features can be considered as elongated areas and region-based analysis implemented, for example, by a self-organizing map (Doucette, et al., 2001), or by a semantic model of the linear structure (Hinz, et al., 2001).

Degrees of Automation
The challenge of image understanding could be regarded as one of varying complexity when viewed across a wide range of disciplines. When the object or pattern has a predictable shape, size, and type, reliable total automation can be achieved. With geographical imagery, however, the characteristics of objects within the image scene can vary enormously. The problems of developing transferable rules for automated object extraction have been recognised for many years (McKeown, et al., 1985). As a result of geographical objects being so variable, attempts to extract them in a totally automated fashion have been largely unsuccessful unless restrictions are placed upon the source image type or the characteristics of the target object. Even within the field of data capture from scanned paper maps, where objects follow certain rules of cartographic representation, user guidance has been necessary to enable successful object extraction in a production environment.
Semi-automated approaches often involve the manual identification and seeding of a certain type of object, such as a road, the geometry of which is then extracted by a sequential line following algorithm (Vosselman and de Knecht, 1995) or the active curve fitting approach of snakes (Gruen and Li, 1997). An alternative approach is to reduce the search space for objects by using existing map data to guide the extraction process (Bordes, et al., 1997). Such approaches must address issues of cartographic generalisation and in particular the degree to which positional information can be relied upon (Priestnall and Glover, 1998; Abramovich and Krupnik, 2000).

The Introduction of Context
Attempts to increase the level of automation may utilize some of the contextual information which humans employ when interpreting an image. The placement of an object within the wider scene and its inter-relationships with other objects at a range of scales constitute general contextual knowledge (Priestnall and Wallace, 2000). When putting these broad concepts into practice, more specific mechanisms for representing contextual clues have been described. Contextual regions and local rule-based sketches (Baumgartner, et al., 1997) represent different levels of spatial context. Containment within broad land use regions influences the type of object patterns observed, and at the local level certain rules can describe commonly observed inter-relationships between objects of different types. Local relationships between roads and linear groupings of buildings are presented by Stilla and Michaelsen (1997). In addition to knowledge contained within a single scene, collateral evidence from other imagery can be considered (Tonjes and Growe, 1998). When classifying linear features such as roads, characteristics including the reflectance properties and the geometric pattern of the extracted linear primitive are taken into account (Vosselman and de Knecht, 1995). The importance of establishing a wider network of topologically connected objects, as recognised by Steger, et al. (1997), could be regarded as introducing one aspect of contextual knowledge into recognition strategies. Wang and Trinder (2000) illustrate the use of simple rules to eliminate non-road features that are isolated from the wider network after hierarchical grouping of line segments. The concept of a wider, connected, functional network can be used to build up and evaluate networks in an automated fashion where high quality extraction results and reference networks are available (Wiedemann, 1999).

A Framework for Managing Contextual Information
In order to exploit contextual knowledge and to provide a high degree of automation, a framework must be designed that is powerful, flexible, and extensible. The complex nature of object properties and inter-relationships suggests the use of an object-oriented geospatial database. The requirement for rapidly extracting linear networks utilizing a range of both source imagery and extraction algorithms with a high degree of automation has led to the development of a software control strategy to manage the complexity of the whole process. The following section presents the control strategy developed within the ALFIE project.
Control Strategy
The aim of the ALFIE project was to establish a framework that facilitates the automatic extraction, identification, and attribution of linear features from whatever imagery is available at the time. The resulting system adopts an approach to automation that minimizes human interaction up to a final editing stage. A sequence of operational modules, referred to as the process flow, allows the basic principles of the ALFIE approach to be implemented. Decisions are required at each stage as to the appropriate actions to be taken, and a control strategy is required to replace human input in this decision making process. The core software component of the control strategy, termed the control interface, utilizes look-up tables to initiate the most appropriate action at each stage and controls information flow between these stages, termed control modules. Figure 1 summarises the operations required to implement the automated procedure and their organization into a process flow of control modules, described in detail in a section to follow.

The preparation stage involves automated selection of available imagery, linear extraction algorithm, and appropriate parameters. The combination of image and algorithm selected determines the nature of pre-processing undertaken in the next stage. Contextual information is utilized in several ways, beginning with the influence of broad landscape regions upon the choice of extraction technique and parameters. During the collateral extraction stage, supporting contextual evidence is gathered to refine later classification of linears and to limit the search space for network construction. This includes the capability to utilize generalized network information provided by the standard military vector product VMAP (NIMA, 1995) where this is available; however, the system is not reliant on these data. Centerlines of linear features in the image are extracted and vectorized, remaining unclassified primitives at this stage. Contextual filtering based upon water and vegetation masks removes some primitives, and local snapping attempts to build longer linear objects. The classification stage involves attributing each linear object with geometric and photometric properties that are then used to discriminate between roads, railways, and rivers using a cluster-weighted model. A structured network of linear features is important; therefore, operations to build a topological network are initiated. Validation statistics and graphics enable the user to assess the degree of success before a final user-guided editing stage.

The entire procedure is based around the idea of having vector objects, the properties of which provide discriminating evidence that becomes attribution to aid the classification of that object. Incorporation of contextual information both locally and regionally, and the operations to construct a more globally consistent network, are fundamental. In order to implement this, a requirement exists for a powerful and flexible spatial database development environment to allow potentially complex object schema to be designed and appropriate functionality to be associated with different classes of object.

Object-Oriented Database Technology
One approach to incorporating contextual information is to build it into the extraction algorithm. This may be appropriate if a single algorithm is used, but it significantly reduces the flexibility of a framework where a toolkit of algorithms is required to handle a variety of image types and resolutions.
A more flexible approach is to divorce the contextual rules from the extraction process. Unclassified linear objects then require a technique to store complex attribution for each object, representing its characteristics and context within the wider scene. Object-oriented geospatial databases offer a good platform on which to develop such techniques.

The Object-Oriented (O-O) approach has prevailed for some time in computer science and offers a logical and flexible way of modelling the real world (Worboys, 1994). Following this approach, real world entities can be abstracted and held as objects of certain types, each type being characterised by certain attributes and operations. When implemented in code, each object type becomes a class with attributes represented by data types ranging in complexity from simple integers to spatial geometries. The operations encapsulated within each object class are implemented as software methods.

Object-orientation has been fundamental in the approach taken to linear classification by the ALFIE project. Extracted linear features are maintained as objects within the Laser-Scan O-O spatial database. By defining suitable methods it becomes possible to interrogate primitive linear objects for contextual information that can be used for their classification. Table 1 lists the methods defined to assist the classification of linear objects within the ALFIE system. These value methods dynamically extract attributes from both the source image and the extracted linear vector primitives. As this information is derived on the fly by the method, rather than being stored as a static attribute, the information can be guaranteed to be up-to-date, automatically honoring any changes made to the database. Each vector primitive extracted from the image is stored as a separate object of class unclassified linear, with the ability to derive attributes using the value methods (shown in Table 1) encapsulated within that object.

Figure 1. Components of the ALFIE system: (a) Operations undertaken by the control modules; (b) The process flow of control modules guided by the control strategy.

TABLE 1. VALUE METHODS USED TO REPRESENT GEOMETRIC, PHOTOMETRIC AND ELEVATION PROPERTIES (AFTER WALLACE, ET AL., 2001)

Broad Type of Attribute — Attribute Derived by a Value Method

Geometric:
  Length of linear
  Straight-line length from one end of linear to the other
  Total curvature along the linear
  Maximum angle of turn along the linear
  Sinuosity index (length/straight-line length)
  Orientation of the linear

Geometric/Photometric:
  Width of linear (similar spectral values alongside centre-line)
  Variation in width along the linear

Photometric:
  Dominant spectral value along the linear
  Mean spectral value along the linear
  Variation in spectral value along the linear (standard deviation)
  Gradient of spectral value along the linear
  Number of significant discontinuities in spectral value along the linear

Elevation:
  Dominant elevation along the linear
  Mean elevation along the linear
  Variation in elevation along the linear (standard deviation)
  Gradient along the linear
  Number of significant discontinuities in elevation along the linear
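As an informal illustration of this idea, the sketch below shows how value methods that compute attributes on demand might look. The class name UnclassifiedLinear, the simple vertex-list geometry, and the nearest-neighbour raster sampling are assumptions made for this example rather than details of the Laser-Scan schema used by ALFIE.

```python
# Illustrative sketch only: ALFIE's value methods live inside a Laser-Scan
# O-O spatial database; a plain Python class stands in for that schema here.
import math
from statistics import mean, pstdev

class UnclassifiedLinear:
    """A linear primitive that derives its attributes on the fly."""

    def __init__(self, vertices, image):
        self.vertices = vertices      # [(row, col), ...] along the centreline
        self.image = image            # 2D array-like of spectral values

    def length(self):
        return sum(math.dist(a, b) for a, b in zip(self.vertices, self.vertices[1:]))

    def straight_line_length(self):
        return math.dist(self.vertices[0], self.vertices[-1])

    def sinuosity(self):
        straight = self.straight_line_length()
        return self.length() / straight if straight > 0 else float("inf")

    def _samples(self):
        # Sample the raster at each vertex (nearest neighbour, for brevity).
        return [self.image[int(round(r))][int(round(c))] for r, c in self.vertices]

    def mean_spectral_value(self):
        return mean(self._samples())

    def spectral_variation(self):
        return pstdev(self._samples())

    def attribute_vector(self):
        # Values are recomputed on every call, so edits to the geometry or the
        # imagery are automatically honoured, mirroring the "value method" idea.
        return {
            "length": self.length(),
            "sinuosity": self.sinuosity(),
            "mean_spectral": self.mean_spectral_value(),
            "spectral_stdev": self.spectral_variation(),
        }
```

Because nothing is cached, the attribute vector always reflects the current state of the database, at the cost of recomputation each time it is requested.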
Given this underlying data model, the extraction of linears and the population of this geospatial database are initiated by the control strategy, which then attempts to place objects within the classes single-carriageway road, dual-carriageway road, railway, or river, and to build road class objects up into a complete network.

Process Flow Stages
One of the central tenets of the ALFIE approach is its modularity. The control modules form discrete stages of the process flow, the order of execution of these modules and their interactions being governed by the control interface. These two components form the control strategy, introduced in a previous section. The sections following describe in more detail the processing undertaken by the control strategy.

Preparation
An integral part of ensuring a good quality output is the selection of the imagery and relevant algorithms to perform the initial linear extraction. A number of algorithms available within QinetiQ and in the public domain have been investigated. Each of these has been thoroughly tested to determine its suitability for different image types and domains. The optimal parameters with which each algorithm is best applied have also been determined. The results from this testing form a set of rules that are implemented in the form of look-up tables. The look-up tables are structured around the following input parameters:

• Imagery available;
• Linear feature types required;
• Level of confidence required;
• Timescale permitted for processing.

The choice of imagery, algorithm, and parameters is affected by context regions, derived from land cover classification of coarse resolution imagery. Separate look-up tables have been created for the context region types supported by the ALFIE database schema, namely rural, urban, and forest. Table 2 shows an example of a look-up table for a particular context region: a matrix of image type against linear extraction algorithm. For a given context region and available image type, the extraction algorithms which are available and appropriate are highlighted, and of those, the optimal choice is selected along with the appropriate set of parameters (although these are not illustrated here).

TABLE 2. EXAMPLE LOOK-UP TABLE USED BY THE PREPARATION CONTROL MODULE
The table is a matrix of image type (coarse-, mid-, and high-resolution, each in multispectral and panchromatic form) against extraction algorithm (Susan, Linefinder, MSEL). Cells are marked "o" where the algorithm is appropriate for use with that imagery and "*" where the algorithm is optimal for that imagery.

Pre-Processing
A major assumption of the ALFIE project is that imagery made available for input has been suitably georeferenced and radiometrically corrected, and working under this assumption the potential effects of poor georeferencing have not been rigorously analysed within the project. Despite this, the imagery in its native form may not be optimised for the extraction algorithm selected in the previous control module. To improve the results of the extraction, tailored pre-processing, such as edge enhancement to bring out linear edges, or smoothing and segmentation to reduce noise, is appropriate in some cases. The required parameters for the pre-processing look-up table have been derived through sensitivity analysis, and their selection is fully automated.
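The fragment below is a rough sketch of how such calibrated rules might be encoded and queried by the control interface. The table entries, parameter names, and the choose_extraction helper are placeholders invented for illustration and are not ALFIE's actual look-up tables.

```python
# Hypothetical preparation/pre-processing look-up: the real ALFIE tables are
# calibrated from extensive algorithm testing; these entries are placeholders.
PREPARATION_LOOKUP = {
    # (context_region, image_type) -> (algorithm, parameters)
    ("rural", "high_res_panchromatic"): ("Linefinder", {"kernel_width": 5}),
    ("rural", "mid_res_multispectral"): ("MSEL", {"threshold": 0.3}),
    ("urban", "high_res_panchromatic"): ("Susan", {"brightness_threshold": 20}),
}

PREPROCESSING_LOOKUP = {
    # (algorithm, image_type) -> pre-processing steps applied before extraction
    ("Linefinder", "high_res_panchromatic"): ["edge_enhancement"],
    ("MSEL", "mid_res_multispectral"): ["smoothing", "segmentation"],
}

def choose_extraction(context_region, image_type):
    """Return (algorithm, parameters, preprocessing) for one area of imagery."""
    try:
        algorithm, params = PREPARATION_LOOKUP[(context_region, image_type)]
    except KeyError:
        raise ValueError(f"No rule for {context_region}/{image_type}; "
                         "a new imagery/algorithm pairing must be calibrated first.")
    preprocessing = PREPROCESSING_LOOKUP.get((algorithm, image_type), [])
    return algorithm, params, preprocessing

print(choose_extraction("rural", "high_res_panchromatic"))
```

Keeping the rules in data rather than in code is what allows new imagery and algorithms to be added by extending the tables, as the following paragraph notes.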
The introduction of new imagery or algorithms to the system would require additional preparatory analysis to update the pre-processing look-up tables.

Collateral Extraction
The majority of the feature information required comes from the linear extraction process. However, the classification and network building components require additional contextual information. These data are derived either directly from the provided imagery or from the standard military product VMAP Level 1, if available for the area of interest. The collateral extraction products that are derived are as follows:

• Water mask;
• NDVI (and vegetation mask);
• Image texture;
• VMAP junction locations.

Water and vegetation masks are instrumental in refining the classification of linears, as typically noise features in rural areas may be misclassified as water features, while small-scale vegetation features such as hedgerows may be classified as roads. By applying water and vegetation masks, misclassified linears can be identified and removed.

A limitation of the classification process is the difficulty in discriminating between road and railway features. Both appear radiometrically similar in panchromatic and multispectral optical imagery. However, a measure of image texture at high resolution can aid discrimination, as road tarmac is typically more homogenous in appearance than the combination of rail, sleeper, and gravel forming the railway line. The texture measure is used as a discriminant input into the classifier, as discussed in a section to follow.

In some areas of interest, the standard military vector product VMAP Level 1 is available. If so, this can yield useful information, despite the fact that its accuracy (equivalent to 1:250000 map scale) is insufficient for direct comparison with the imagery, particularly at high resolution. In the case of topologically structured datasets, the key nodes (e.g., junctions, bridges, and intersections) are likely to be more spatially accurate than the links. Therefore, all the nodes in the dataset are parsed, and those representing features of interest to the user are stored for later use. The links associated with those nodes are also stored, so that the topological graph of the VMAP network can be traversed. This graph is then used as a guide during the network build process.
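As a concrete illustration of the first two collateral products listed above, the sketch below derives a vegetation mask from NDVI and a crude water mask from near-infrared reflectance. The thresholds, band scaling, and NumPy-based workflow are assumptions made for this example and are not taken from the ALFIE implementation.

```python
# Illustrative mask derivation; suitable thresholds depend on the sensor and
# scene, so treat the values below as placeholders.
import numpy as np

def collateral_masks(red, nir, ndvi_threshold=0.3, water_nir_threshold=0.1):
    """Return (vegetation_mask, water_mask) as boolean arrays.

    red, nir: reflectance bands scaled to 0..1 as 2D NumPy arrays.
    """
    ndvi = (nir - red) / np.clip(nir + red, 1e-6, None)   # avoid divide-by-zero
    vegetation_mask = ndvi > ndvi_threshold                # dense vegetation
    water_mask = (nir < water_nir_threshold) & (ndvi < 0)  # water absorbs NIR
    return vegetation_mask, water_mask

# Linear primitives falling largely inside either mask would later be filtered
# out of the road/railway candidate set, as described above.
```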
Linear Extraction
The automation of linear extraction generally takes one of two basic approaches: identification then extraction, or extraction then identification. The first of these is essentially an automated form of the manual digitising process, using line following or snake algorithms as previously discussed. Such approaches can be very sensitive to local image properties, often failing to detect any linears at all in some parts of an image, and are often tuned to a single feature type. The ALFIE approach overcomes these limitations by employing the second method. In this case, every linear feature that can be detected is extracted as an unclassified primitive; an example using a Russian KVR image with a GSD of approximately 2 m is shown in Figure 2. While this exhaustive method is expensive in terms of CPU time and data volume, it ensures that all features of different classes are extracted, with the identification (or classification) of these features being performed later. This method is only made possible by using the OODB as a storage medium, and by using context in the identification process.

Figure 2. Typical output from the linear extraction module: (a) Original KVR image of Worcester, UK, and (b) Linears extracted from the area shown in (a).

The output illustrated is from the Linefinder algorithm, a centreline detector using a Marr-Hildreth filter, which is optimised for use with high-resolution imagery. The algorithm is particularly good at extracting broad features in less-urbanized regions, but suffers from fragmentation, requiring adjacent primitives to be grouped into longer, more complete features. The fragmentation of features depicted in Figure 2 is typical of a linear extraction from satellite imagery, whereby extraction quality is heavily affected by other objects in the scene. This is a particular problem in urban areas, where object variability is high and a large number of occlusions are present (due to building overhang, shadow, or on-road features such as vehicles).

The degree of connectivity within the dataset can be improved by grouping the individual primitives. In order to join these disconnected primitives, a snapping procedure, summarized in Figure 3, is applied across all the extracted linear objects. A radial neighbourhood at the endpoints of each linear object (the source object) is examined. If one or more linear objects (target objects) are found within this search radius, a matching algorithm is applied that considers the spectral properties of the original raster image and the orientation of the target and source objects. Source objects are matched with the target object whose spectral signature and orientation are most similar. If this similarity falls within a pre-defined tolerance, the two objects are merged into one. If no adjacent features are found within the search radius, the feature is deemed a candidate for removal. The removal of very small, isolated primitives reduces the level of noise in the dataset.

Figure 3. Snapping and cleaning process.

The grouping procedure is carefully controlled, with sensible search radii and strict similarity tolerances used, in order to prevent unreasonable snapping from occurring. Unconstrained snapping may allow primitives representing two completely different features, or separated by too great a distance, to be joined. Grouping is performed iteratively to allow features to be built up progressively, until a pass yields no further changes to the dataset.
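A minimal sketch of such an iterative snapping pass is given below. The combined orientation-plus-spectral dissimilarity measure, the tolerances, and the dictionary-based segment representation are simplifications assumed for illustration; the matching actually applied by ALFIE is considerably richer.

```python
# Simplified endpoint snapping: repeatedly merge the most similar neighbouring
# primitives until a pass makes no further changes (placeholder tolerances).
import math

def orientation(seg):
    (r0, c0), (r1, c1) = seg["vertices"][0], seg["vertices"][-1]
    return math.atan2(r1 - r0, c1 - c0)

def dissimilarity(a, b):
    angle = abs(orientation(a) - orientation(b)) % math.pi
    spectral = abs(a["mean_spectral"] - b["mean_spectral"])
    return angle + spectral / 50.0            # crude combined score

def snap_pass(segments, search_radius=20.0, tolerance=0.5):
    merged, used = [], set()
    for i, src in enumerate(segments):
        if i in used:
            continue
        best, best_score = None, tolerance
        for j, tgt in enumerate(segments):
            if j == i or j in used:
                continue
            gap = math.dist(src["vertices"][-1], tgt["vertices"][0])
            if gap <= search_radius and dissimilarity(src, tgt) < best_score:
                best, best_score = j, dissimilarity(src, tgt)
        if best is not None:
            tgt = segments[best]
            merged.append({"vertices": src["vertices"] + tgt["vertices"],
                           "mean_spectral": (src["mean_spectral"] + tgt["mean_spectral"]) / 2})
            used.update({i, best})
        else:
            merged.append(src)
            used.add(i)
    return merged, len(merged) < len(segments)

def snap_until_stable(segments):
    changed = True
    while changed:
        segments, changed = snap_pass(segments)
    return segments
```

The strict tolerance and bounded search radius play the same role as in the text: they prevent unrelated or widely separated primitives from being joined.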
Classification
The output from the linear extraction stage is an entirely unclassified set of linear primitives. The next stage is to derive a classification into one of the main functional groups (single-carriageway road, dual-carriageway road, river, or railway) and to build up the primitives into a topologically structured network. The database schema provides information on every linear object through database methods. These methods are invoked by the classification module to build up a knowledge base comprising the information about each linear object as described in Table 1. Grouped linear objects are more straightforward to classify than heavily fragmented primitives, as metrics such as sinuosity are unrepresentative of the feature over very short distances.

Engineering rules would suggest that the elevation of the underlying terrain should be a highly significant discriminator; for example, there are limitations on the acceptable gradient of a railway line, and rivers must have a consistent decline in elevation in order for the water to flow. However, the elevation information derived from the standard military product DTED, even at Level 2 (30 m post spacing), is not of sufficient resolution to provide the level of detail required to quantify the terrain.

A number of techniques have been investigated as possible means for classifying the linears based on their attributes, from a straightforward multi-criteria sieving methodology using predefined engineering rules, through to a complex belief network. The chosen classification method should be fuzzy and should provide some measure of probability for the accuracy of the outcome. For the classification of the linear features in ALFIE, a Cluster-Weighted Model (CWM) has been developed. The CWM is a Bayesian probabilistic model with a fixed architecture, but with flexibility in model parameters (Gershenfeld, et al., 1999). It combines the flexibility of a Gaussian mixture model with the benefits of using a general linear model to yield real or discrete valued outputs. The output from the CWM is a straightforward probability table, which has as many columns as there are discrete valued dimensions (see Equation 1). These discrete dimensions correspond to the database methods determined to be significant discriminators. Those attributes found to be most effective in discrimination were sinuosity, width, variation in width, dominant spectral value, and the variation in spectral value.

$$
p(y \mid x, k) = p(y \mid k) =
\begin{bmatrix}
p(y_1 = 1 \mid k) & p(y_2 = 1 \mid k) & \cdots & p(y_M = 1 \mid k) \\
p(y_1 = 2 \mid k) & p(y_2 = 2 \mid k) & \cdots & p(y_M = 2 \mid k) \\
\vdots & \vdots & \ddots & \vdots \\
p(y_1 = L \mid k) & p(y_2 = L \mid k) & \cdots & p(y_M = L \mid k)
\end{bmatrix}
\tag{1}
$$

where x is the set of input features, y is the set of output features, k is the set of clusters, L is the number of different classes, and M is the number of discrete output dimensions. Every column in the probability table sums to unity.

The CWM is trained using a manually created truth dataset representing a typical set of features where class membership is known. This is used first to determine the number of clusters to use for optimum output (the model order), and second to locate the position of those clusters within the multi-dimensional object space. A bootstrap test is used to determine the number of clusters, and the truth data are then compared with the output estimates. By adjusting the number of dimensions within the model, and the properties that these dimensions represent, a "best" performing model was created. As the properties of linear features are, to a certain degree, domain specific, a model was selected that performed most effectively for all context regions within the area of interest. A useful extension to the model would be the development of different classifier models for different context regions.
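The sketch below illustrates only the prediction step of a cluster-weighted classifier of this general form: each cluster carries a Gaussian model of the input attributes and a probability table over the output classes, and the class posterior is obtained by mixing those tables with the cluster responsibilities. The cluster parameters, the two-attribute input, and the weights are invented for illustration, and model training (e.g., by expectation-maximisation) is omitted entirely.

```python
# Cluster-weighted classification, prediction step only (illustrative values).
import math

CLASSES = ["single_carriageway", "dual_carriageway", "railway", "river"]

CLUSTERS = [
    {"weight": 0.5,
     "means": [1.05, 7.0],  "stdevs": [0.05, 2.0],     # [sinuosity, width (m)]
     "p_class": {"single_carriageway": 0.70, "dual_carriageway": 0.20,
                 "railway": 0.08, "river": 0.02}},
    {"weight": 0.5,
     "means": [1.60, 15.0], "stdevs": [0.30, 5.0],
     "p_class": {"single_carriageway": 0.10, "dual_carriageway": 0.05,
                 "railway": 0.05, "river": 0.80}},
]

def gaussian(x, mean, stdev):
    return math.exp(-0.5 * ((x - mean) / stdev) ** 2) / (stdev * math.sqrt(2 * math.pi))

def classify(features):
    """Return p(class | features) under the cluster-weighted model."""
    joint = {c: 0.0 for c in CLASSES}
    for cluster in CLUSTERS:
        # p(x | k): independent Gaussians per input dimension.
        p_x_given_k = 1.0
        for x, m, s in zip(features, cluster["means"], cluster["stdevs"]):
            p_x_given_k *= gaussian(x, m, s)
        for c in CLASSES:
            joint[c] += cluster["weight"] * p_x_given_k * cluster["p_class"][c]
    total = sum(joint.values()) or 1.0
    return {c: p / total for c, p in joint.items()}

print(classify([1.1, 8.0]))   # e.g., sinuosity 1.1, width 8 m
```

In ALFIE the input dimensions would be the five discriminators named above, with the number of clusters chosen by the bootstrap test described in the text.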
Network Building
Following the initial classification process is an iterative network build and classification refinement. This stage takes the thousands of linear primitives and the junction locations taken from VMAP and uses pattern-matching techniques to determine the important nodes in the extracted dataset. Network building follows a hierarchical approach, attempting to construct the major network structure first.

The junction coordinates from the VMAP data are used to seed a junction building process, starting with the most significant (e.g., motorway intersections). The process begins with a radial search in the extracted dataset about each junction for all linear objects that intersect the search radius. Pattern matching algorithms are then applied to establish which of these linears most likely represent the arms of the junction.

In order to prepare for the pattern matching, the collection of extracted linears is filtered. Linears below a threshold length are removed. Linears that intersect the search radius more than twice are removed. Linears whose start and end points are both contained within the radius are removed. This leaves just two types of linear within the collection: (a) those that start or terminate within the search radius (end-point linears), and (b) those that bisect the search radius (bisecting linears), as illustrated in Figure 4. By filtering the linears, the number of potential junction entrants is reduced to a manageable level. Linears representing the features entering a junction are most likely to be of a reasonable length and to begin within the radius or pass right through it.

Figure 4. Detecting junction entrants using VMAP: (a) linears that start or terminate within the search radius; (b) linears that bisect the search radius.

The first step in the pattern matching process is to determine the positions of potential junctions. Given that the true number of junction arms A is known from the VMAP data, all combinations of extracted linears that can represent a junction with A arms need to be established. For this purpose a score is assigned to each of the linears remaining after the filter process: 2 for bisecting linears (since these linears will represent two arms of a junction, see Figure 4) and 1 for end-point linears. For example, a junction with four arms (A = 4) can be comprised of two bisecting linears, or one bisecting linear and two end-point linears, or four end-point linears. Properties of the linears, such as their estimated class, orientation, and pattern, are analysed, and the linears most consistent with the VMAP junction are built together into a junction object.

After determining the locations of critical junctions and intersection nodes within the extracted linears, the interconnectivity between them is structured using the iterative approach shown in Figure 5. A regional approach is used, whereby every junction is visited in turn and its connectivity to every other junction in its vicinity is assessed. The junction entrants are examined pairwise, and entrants of the same class are identified. A least-cost routing method is then used to fit a corridor between them. Cost is calculated from the radiometric values of the pixels between the junction points, with pixels of similar radiometric values to the feature of interest being given low cost. Linears already extracted are burnt into the image as areas of very low cost, as they are most likely to represent the desired route between the two points (see Figure 6). If the cost of the created corridor is above a predefined threshold, or the corridor exhibits unexpected properties, such as a sudden change in radiometric value, then it is discarded.

Figure 5. Iterative network building process.

Figure 6. Corridor building between two junctions.
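To make the corridor-fitting step concrete, the sketch below builds a cost grid from radiometric similarity, burns already-extracted linears in at near-zero cost, and runs a plain Dijkstra search between two junction entrants. The cost function, burn-in value, and grid representation are assumptions made for illustration rather than the routing method actually implemented in ALFIE.

```python
# Illustrative least-cost corridor between two junction entrants.
import heapq

def build_cost_grid(image, target_value, extracted_mask, burn_cost=0.01):
    rows, cols = len(image), len(image[0])
    cost = [[abs(image[r][c] - target_value) + 1.0 for c in range(cols)]
            for r in range(rows)]
    for r in range(rows):                 # burn in already-extracted linears
        for c in range(cols):
            if extracted_mask[r][c]:
                cost[r][c] = burn_cost
    return cost

def least_cost_corridor(cost, start, end):
    """Dijkstra over 4-connected pixels; returns total corridor cost or None."""
    rows, cols = len(cost), len(cost[0])
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        total, (r, c) = heapq.heappop(queue)
        if (r, c) == end:
            return total
        if total > best.get((r, c), float("inf")):
            continue
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                new_total = total + cost[nr][nc]
                if new_total < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_total
                    heapq.heappush(queue, (new_total, (nr, nc)))
    return None

# A corridor whose total cost exceeds a threshold would be discarded, as in the text.
```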
This iterative process continues by detecting additional junctions. These include junctions present in the VMAP dataset that have not been successfully detected in the extracted dataset, as well as further junctions not present in the VMAP dataset. Corridors are then built out from these junctions in order to complete the network further. Junctions not present in the VMAP dataset are identified by using pattern-matching techniques similar to those employed with a known VMAP junction location. In this case, endpoints of significant linears are used as the seed locations, around which a radial search is performed. Any linears falling within the search radius are assessed for their classification, similarity, pattern, and orientation. If sufficient high-scoring linears are found to converge upon a point, then a potential junction is flagged at this location. This potential junction is used in the corridor building process, although its thresholds and constraints are tighter than for junctions detected using VMAP. The pattern matching technique used is quite basic and sensitive to the quality of the local primitives; however, it is effective at detecting the location of distinct converging linears. A number of authors have described more complex methods, such as the Connective Hough Transform (Yuen, et al., 1993) or the probabilistic relaxation approach (Matas and Kittler, 1993), which could be employed at this stage. Alternatively, Lindeberg (1998) describes a multi-scale approach to junction detection directly from raster images, which could be used at the collateral extraction stage to provide an additional junctions dataset.

Validation
The final stage of processing provides a means for the end user to determine the success of the process flow and the completeness of the output. The validation component of the ALFIE system comprises a set of database methods allowing quantitative and qualitative assessment of the results. The validation methods available are summarised in Table 3. Display methods show graphically the areas of the network that are incomplete or incorrect, while process methods parse the complete dataset and provide a set of summary statistics.

TABLE 3. VALIDATION METHODS

Validation Metric                              Type of Validation
Completeness of network                        Display and Statistic
Confidence of classification                   Display and Statistic
Correctness of classification*                 Display and Statistic
Correctness of classification (by class)*      Statistic
Reclassification during network build          Display and Statistic
Final classification of network                Display

*truth data must be available in order to use these validation metrics.

Several methods could be regarded as offering an internal validation and require no reference truth dataset. One such example relates to the completeness of the network with respect to the generalized VMAP network used in junction seeding. These statistics and graphics effectively present the degree to which the connectivity of the VMAP graph is replicated in the extracted graph. Although VMAP is used for seeding junctions, the network building will not always be able to complete network elements between any two seeded junctions, possibly due to the failure to detect one of these junctions in the extracted graph.
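As a simple example of such an internal metric, the fragment below computes the proportion of VMAP junction-to-junction links that were replicated as corridors in the extracted network; the graph representation and function name are assumed purely for illustration.

```python
# Illustrative internal completeness check against the seeded VMAP graph.
def network_completeness(vmap_edges, built_corridors):
    """vmap_edges / built_corridors: sets of frozenset({junction_a, junction_b})."""
    if not vmap_edges:
        return 0.0
    replicated = sum(1 for edge in vmap_edges if edge in built_corridors)
    return replicated / len(vmap_edges)

vmap = {frozenset({"J1", "J2"}), frozenset({"J2", "J3"}), frozenset({"J1", "J3"})}
built = {frozenset({"J1", "J2"}), frozenset({"J2", "J3"})}
print(f"VMAP connectivity replicated: {network_completeness(vmap, built):.0%}")
```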
Other methods report on the classification of linears, beginning with a graphical display of the level of confidence of each linear in its own classification, based around output from the classifier. Any reclassification of linears during the network building stage can be highlighted, and the final classified vector map displayed.

The object-oriented data model being employed allows for more dynamic validation at the object level and offers an area for continued research. The degree to which a line or junction forms part of an overall network could be a valuable validation procedure, and a possible mechanism for this could be the use of agent methods. Objects being aware of their place in a network has clear parallels with intelligent map generalisation (Lamy, et al., 1999). A junction may be locally connected and formed by a legal configuration of entrant linears, but in terms of its wider connectivity, it would be useful if agent methods (or other appropriate techniques) could establish whether entrants at a particular junction connect to other junctions. An index of connectivity could be calculated and displayed for each junction, conveying the degree to which it is fully connected.

Two final methods calculate the correctness of the classification; this absolute assessment of accuracy requires a truth dataset against which to compare the extraction. Therefore, these methods are designed to test the system conceptually during development, but would be inappropriate for use with an operational system. The effective graphical representation of validation information has proved important during system development, in addition to its value for an operational system. The flexibility to graphically roll back to any stage of the processing offers a tool for exploring the output from each control module.

Results and Discussion

Results
The project aim was to demonstrate a flexible and modular framework for the automated extraction of geospatial information from imagery. A complete process flow has been presented, with each stage of processing calibrated and automated by the overarching control strategy. Contextual information is used to provide knowledge for discriminating between the different classes of linear feature. Table 4 presents a confusion matrix for the initial classification of features for the test area illustrated in Figure 7. Table 5 details the overall classification success for both urban and rural areas before and after the network building procedure. The quality of the final output from the ALFIE system, in terms of network completeness and classification accuracy, is heavily dependent upon the quality of the initial extraction.

TABLE 4. CLASSIFICATION CONFUSION MATRIX
(Columns give the actual feature class; rows give the extraction result; each column sums to approximately 100%.)

Extraction Result          Road—Dual      Road—Single
                           Carriageway    Carriageway    Railway    River
Road—Dual Carriageway         80%             0%            0%        0%
Road—Single Carriageway       15%            56%           31%        0%
Railway                        5%            31%           68%        3%
River                          0%            12%            1%       97%

The algorithms currently employed by ALFIE, which were considered to be the most appropriate available at the time, are not able to produce perfect output, particularly in terms of maintaining the connectivity of the linear network. Figure 7 shows some sample output from the complete ALFIE process flow for a test area in Worcestershire, England, compared with a manually produced truth dataset. Although the linear network produced by the ALFIE process is significantly less complete than both the truth dataset and the VMAP vector product, it is a potentially more precise, spatially accurate, and topologically correct representation of the detected real-world features in the area, due to the relatively high spatial resolution of the source imagery.
Figure 7. (a) True transport network; (b) ALFIE extracted network.

TABLE 5. OVERALL PERFORMANCE OF CLASSIFICATION AND NETWORK BUILD PROCESSES

Criterion                    Context Region    Pre-Network Build    Post-Network Build
Classification Confidence    Rural             89%                  100%
                             Urban             89%                  100%
Classification Accuracy      Rural             78%                  99%
                             Urban             52%                  96%
VMap Junctions Detected      Rural             n/a                  53%
                             Urban             n/a                  27%
Linears Extracted            Rural             72%                  n/a
                             Urban             37%                  n/a
Network Extracted            Rural             18%                  70%
                             Urban             16%                  21%
Network Connectivity         Rural             n/a                  65%
                             Urban             n/a                  60%

As can be seen, the major road intersection in the north-east of the region has been extracted particularly well. Additionally, the railway cutting through the image from north to south is detected well, although some confusion can be seen where multiple tracks run parallel to each other. The river feature in the center is extracted well in rural areas but is lost as it enters the urban area. Two examples of erroneous extraction are highlighted in Figure 7b. The feature running across the center of the area illustrates how a series of minor road segments has been linked together to form an extracted single carriageway road. The large river towards the left of the area is not extracted because its width exceeds the threshold of the extraction algorithm; however, the river is extracted as part of the water mask, as previously described.

Extraction of the network within the urban area is less effective than within the rural area. Although the extraction algorithm detects a significant percentage of the linear features, their appearance is too fragmented for a successful network build. Furthermore, the extraction of building line edges and other urban clutter reduces the discriminatory capability of the classifier. In areas of clean extraction, the classification and network building processes work well. The algorithms currently available within ALFIE are less effective in extracting linear networks within the urban area. However, the ALFIE approach is designed to be future proof, so that new developments in linear extraction algorithms can be added when they become available without affecting the overall structure of the system.

Implications of the Research
The digitisation of geospatial data and the extraction of information from imagery are manually intensive, time-consuming tasks. The drive to develop automated techniques to extract this information spans all areas that can benefit from geospatial data. The impetus for the ALFIE project was the military requirement for the rapid and cost-effective generation of tailored geospatial data for mission planning and rehearsal. However, the techniques employed are equally applicable to any application, military or civilian, requiring geospatial data and information extracted from image sources. A particular application of interest is the updating of cartographic databases for national map revision.

The control strategy approach provides flexibility in operational requirements, allowing a priori information to select processing techniques that optimise data capture and ensuring that output is tailored to the application requirements. Furthermore, its modularity provides a means of future-proofing, whereby new technologies and algorithms can be added at a later date.
On-Going Research
Further evaluation of the system against alternative methods, including manual digitising, would be beneficial, adding speed and ease of use to the validation metrics referred to in Table 3. Many aspects of the research at the control module level offer opportunities for on-going research. These include the incorporation of paired outlines from edge extractors in addition to the centerlines currently extracted by the ALFIE system. The way in which contextual information is translated into inputs to the classifier, and the architecture of the classifier itself, are subjects for future study.

A follow-up project, ALFIE-2, is extending the ALFIE system to 3D and areal feature extraction. This information can then be used to improve the classification of features by providing additional contextual cues. Extending the ALFIE approach to incorporate 3D features brings a number of other benefits for SNE database generation:

• Significant value can be added to the data when compared to standard products such as VMAP. For example, 3D topology can be constructed at the time of extraction, representing over- and underpasses and ensuring consistency with the underlying terrain.
• Up-to-date terrain data can be extracted from the same imagery used to extract the linear and areal features, with the terrain extraction tailored to specific requirements.
• The direct extraction of 3D information (as opposed to the use of existing Digital Elevation Model products such as DTED, which are limited by resolution) provides additional contextual clues, for example gradients, which can improve the classification of features. Groupings of buildings can be more reliably discriminated and therefore included as local contextual clues within the classification process for linear features such as roads. The improved classifications can in turn refine the accuracy of the feature matching process required to extract the 3D information. There is therefore a positive feedback loop.

Conclusions
A framework has been presented which supports the rapid generation of transport networks from imagery. A control strategy manages a process flow of information between control modules, utilizing the power of an underlying object-oriented geospatial database. The architecture has been designed with future developments in mind, being based around a toolkit of available algorithms and imagery. The system is flexible and extensible, allowing the modular addition of new algorithms or techniques while the overall framework of the process flow remains the same.

The modular ALFIE system has proved a powerful development environment, allowing the recognition process to be followed through and sensitivities within, and dependencies between, modules to be studied. The control strategy has helped to manage the complexity of the problem and has allowed contextual information to be incorporated in various ways throughout the process flow. The adoption of an object-oriented geospatial database has allowed complex discriminating characteristics of objects to be dynamically extracted, effectively enabling objects to classify themselves. SNE databases require much more than 2D linear networks, and so the research is currently being extended to investigate the extraction of areal and 3D features. This extension to incorporate 3D information also offers powerful additional contextual clues, which can be used to improve the classification of linear networks.
References

Abramovich, T., and A. Krupnik, 2000. A quantitative measure for the similarity between features extracted from aerial images and road objects in GIS, International Archives of Photogrammetry and Remote Sensing, Vol. XXXIII, Part B3, Amsterdam 2000, unpaginated CD-ROM.

Baumgartner, A., W. Eckstein, H. Mayer, C. Heipke, and H. Ebner, 1997. Context-supported road extraction, Automatic Extraction of Man-Made Objects from Aerial and Space Images (II) (A. Gruen, E.P. Baltsavias, and O. Henricsson, editors), Birkhauser Verlag, Basel, pp. 299–308.

Bordes, G., G. Giraudon, and O. Jamet, 1997. Road modelling based on a cartographic database for aerial image interpretation, Semantic Modelling for the Acquisition of Topographic Information from Images and Maps (W. Förstner and L. Plümer, editors), Birkhauser Verlag, Basel, pp. 123–139.

Doucette, P., P. Agouris, A. Stefanidis, and M. Musavi, 2001. Self-organised clustering for road extraction in classified imagery, ISPRS Journal of Photogrammetry and Remote Sensing, 55:347–358.

Forghani, A., 2000. Semi-automatic detection and enhancement of linear features to update GIS files, International Archives of Photogrammetry and Remote Sensing, Vol. XXXIII, Part B3, Amsterdam 2000, unpaginated CD-ROM.

Gershenfeld, N., B. Schoner, and E. Metois, 1999. Cluster weighted modeling for time series analysis, Nature, 397.

Gruen, A., and H. Li, 1995. Road extraction from aerial and satellite images by dynamic programming, ISPRS Journal of Photogrammetry and Remote Sensing, 50(4):11–20.

Hinz, S., A. Baumgartner, H. Mayer, C. Wiedemann, and H. Ebner, 2001. Road extraction focussing on urban areas, Automatic Extraction of Man-Made Objects from Aerial and Space Images (III) (E. Baltsavias, A. Gruen, and L. Van Gool, editors), Balkema, Lisse, pp. 255–266.

Hofmann, P., and W. Reinhardt, 2000. The extraction of GIS features from high resolution imagery using advanced methods based on additional contextual information—first experiences, International Archives of Photogrammetry and Remote Sensing, Vol. XXXIII, Part B4, Amsterdam 2000, unpaginated CD-ROM.

Lamy, S., A. Ruas, Y. Demazeau, M. Jackson, W.A. Mackaness, and R. Weibel, 1999. The application of agents in automated map generalisation, Proceedings of the 19th ICA/ACI Conference, Ottawa, Canada, pp. 160–169.

Lindeberg, T., 1998. Feature detection with automatic scale selection, International Journal of Computer Vision, 30(2):79–116.

Matas, J., and J. Kittler, 1993. Junction detection using probabilistic relaxation, Image and Vision Computing, 11(4).

McKeown, D., W. Harvey, and J. McDermott, 1985. Rule-based interpretation of aerial images, IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-7(5):570–585.

NIMA, 1995. Vector Map (VMap) Level 1 Specification, URL: http://earth-info.nga.mil/publications/specs/printed/89033/VMAP_89033.pdf (last date accessed: 23 September 2004).

Priestnall, G., and R. Glover, 1998. A control strategy for automated land use change detection: An integration of vector-based GIS, remote sensing and pattern recognition, Innovations in GIS 5 (S. Carver, editor), Taylor and Francis, London, pp. 162–175.

Priestnall, G., and S. Wallace, 2000. Semi-automated linear feature extraction using a knowledge rich object data model, International Archives of Photogrammetry and Remote Sensing, Vol. XXXIII, Part B3, Amsterdam 2000, unpaginated CD-ROM.

Sonka, M., V. Hlavac, and R. Boyle, 1999. Image Processing, Analysis and Machine Vision, 2nd Edition, Brooks/Cole Publishing, 828 p.
Steger, C., H. Mayer, and B. Radig, 1997. The role of grouping for road extraction, Automatic Extraction of Man-Made Objects from Aerial and Space Images (II) (A. Gruen, E.P. Baltsavias, and O. Henricsson, editors), Birkhauser Verlag, Basel, pp. 245–256.

Stilla, U., and E. Michaelsen, 1997. Semantic modelling of man-made objects by production nets, Automatic Extraction of Man-Made Objects from Aerial and Space Images (II) (A. Gruen, E.P. Baltsavias, and O. Henricsson, editors), Birkhauser Verlag, Basel, pp. 43–52.

Teoh, C.Y., and A. Sowmya, 2000. Junction extraction from high resolution images by composite learning, International Archives of Photogrammetry and Remote Sensing, Vol. XXXIII, Part B3, Amsterdam 2000, unpaginated CD-ROM.

Tonjes, R., and S. Growe, 1998. Knowledge based road extraction from multisensor imagery, Proceedings of the ISPRS Symposium on Object Recognition and Scene Classification from Multispectral and Multisensor Pixels, Commission III, Working Group 4, 06–10 July 1998, Columbus, Ohio, USA.

Vosselman, G., and J. de Knecht, 1995. Road tracing by profile matching and Kalman filtering, Automatic Extraction of Man-Made Objects from Aerial and Space Images (A. Gruen, O. Kuebler, and P. Agouris, editors), Birkhauser Verlag, pp. 265–274.

Wallace, S.J., M.J. Hatcher, R.G. Ley, G. Priestnall, and R.D. Morton, 2001. Automatic differentiation of linear features extracted from remotely sensed imagery, Österreichische Zeitschrift für Vermessung und Geoinformation, Heft 34:17–29.

Wang, Y., and J. Trinder, 2000. Road network extraction by hierarchical grouping, International Archives of Photogrammetry and Remote Sensing, Vol. XXXIII, Part B3, Amsterdam 2000, unpaginated CD-ROM.

Wiedemann, C., 1999. Completion of automatically extracted road networks based on the function of roads, Automatic Extraction of GIS Objects from Digital Imagery (H. Ebner, W. Eckstein, C. Heipke, and H. Mayer, editors), International Archives of Photogrammetry and Remote Sensing, (32) 3-2W5.

Worboys, M.F., 1994. Object-oriented approaches to geo-referenced information, International Journal of Geographical Information Systems, 8(4):385–399.

Yuen, S.Y.K., T.S.L. Lam, and N.K.D. Leung, 1993. Connective Hough Transform, Image and Vision Computing, 11(5):295–301.