Automatic Extraction
Written by Peter Buxbaum
GIF 2011 Volume: 9 Issue: 2 (March)
As the volume of imagery grows, analysts need software help in recognizing specific objects, from hills to roads, in digitized data.
The availability of increasing numbers of imagery sources is making greater automation of feature extraction an approaching reality, even as the exploding volume of available multi-source imagery is making it a necessity.
Automated feature extraction is a capability that allows software to recognize certain specific objects represented in digitized imagery or other data, such as light detection and ranging (LiDAR) point clouds. Programming software to be on the lookout for topographical features such as hills, or man-made objects such as buildings, vehicles or power transmission lines, allows those features to be separately and distinctly portrayed in the intelligence end-products created by analysts for the benefit of planners, decision-makers, commanders and warfighters.
Fusing data from multiple sources, such as panchromatic, hyperspectral and LiDAR sensors, increases the probability that features can automatically be extracted. Such a process identifies features such as buildings, vegetation or bodies of water by using a combination of spectral and elevation characteristics. The huge volume of available imagery and data makes it impossible to exploit these in the absence of automation.
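As a rough illustration of how spectral and elevation characteristics can be combined, the sketch below labels individual pixels using a vegetation index computed from red and near-infrared reflectance together with a LiDAR-derived height above ground. The thresholds, class names and sample values are assumptions chosen for illustration, not figures from any vendor's product.

```python
def ndvi(red, nir):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def classify_pixel(red, nir, height_m,
                   veg_ndvi=0.4, water_ndvi=0.0, min_raised_m=2.5):
    """Label a pixel from spectral (red, nir) and elevation (height_m) inputs.

    All thresholds are hypothetical defaults for this sketch.
    """
    index = ndvi(red, nir)
    if index >= veg_ndvi:
        # Spectrally vegetation; height separates trees from ground cover.
        return "tree" if height_m >= min_raised_m else "low vegetation"
    if index < water_ndvi and height_m < min_raised_m:
        return "water"
    if height_m >= min_raised_m:
        # Not vegetation, but well above ground: treat as a structure.
        return "building"
    return "bare ground"


# (red reflectance, near-infrared reflectance, height above ground in meters)
pixels = [
    (0.05, 0.50, 12.0),
    (0.30, 0.32, 8.0),
    (0.10, 0.04, 0.0),
    (0.25, 0.30, 0.1),
]
labels = [classify_pixel(*p) for p in pixels]
print(labels)
```

Neither input alone separates these classes: the tree and the building are both tall, and the building and the bare ground look spectrally alike. The combination of the two is what distinguishes them.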
While technology providers are still at work on the algorithms and software that could make completely automated feature extraction an eventual reality, the process nowadays continues to involve a collaboration between human and machine. The eventual goal is for the machine to play a greater, and the human a smaller, role.
Segmentation Capability
Automated feature extraction revolves around a few different technologies. One key capability is segmentation, in which an image is divided into finer or coarser elements at the direction of the user. Other technologies include machine learning capabilities, which can improve the performance of the algorithms used to analyze images.
“Segmentation represents a powerful set of capabilities that have been introduced in the last few years,” said Stuart Blundell, vice president for geospatial products and solutions at Overwatch Systems, an operating unit of Textron Systems, a Textron Inc. company. “That, in combination with machine learning approaches, represents the forefront of automated feature extraction.”
Feature extraction is a simplified way of stating that an analyst is trying to get information out of imagery and place it into usable packages. This could refer to a manual or automated process, or some combination of the two.
“Analysts often are looking for specific features such as road networks, waterways, rooftops or ships in a harbor,” added Pete McIntosh, manager of solutions engineering at ITT VIS. “Feature extraction tools analyze the pixels of digitized imagery and the relationships among pixels to form a piece of information. There are a multitude of new applications involving the analysis of imagery data.”
More complex feature extractions, such as those identifying each object in a broad landscape scene and adding descriptive attributes to each, continue to be more of a manual than an automated process, noted Tom Lobonc, product line director for defense at ERDAS.
“Computers are still not 100 percent successful at most feature extraction operations,” he said. “In some cases computers can do things quickly and reliably, but other operations humans can do better. In certain processes you can merge those two together. The operator can select tailored algorithms and provide problem-specific input to the algorithm working on the extraction, which can significantly reduce the amount of manual intervention necessary. The overall process goes more quickly because the computer is doing things a lot faster than the human operator alone can.”
Terrain elevation information can often be automatically extracted from LiDAR point cloud data, said Blundell. “Some data sets lend themselves to different levels of automation,” he explained. “Because of the way LiDAR data is formulated and processed, an algorithm can often run on LiDAR data autonomously to extract terrain data without human intervention.”
Computers do well at automatic point matching from multiple images, a process which is often used for terrain elevation extraction, according to Lobonc. “Humans still need to follow up on the back end to clean up anything the computer didn't catch,” he said.
More complicated is the extraction of three-dimensional wire frames of objects from images. “The computer is very good at finding pixels on edges, but not as good at aggregating the individual pixels to form a continuous clean edge, or identifying common edges between two images, which is required to form the 3-D wireframe,” said Lobonc. “The human operator performs this edge matching instantly and subconsciously during a manual collection.”
Law of Homogeneity
Computers have also proved efficient at compiling general land features over large areas from multispectral images. Here again, the machine can benefit from human assistance. “Before starting the algorithm, the operator can tag certain areas as water or land and others as not water and not land,” Lobonc explained. “The computer then analyzes the characteristics of the pixels in those areas and will identify groups of pixels with similar spectral and spatial properties as being of the same type of feature.”
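The tagging workflow Lobonc describes resembles supervised classification. A minimal sketch, assuming a simple nearest-centroid classifier on spectral values only (the spatial properties he mentions are left out, and the band values are invented for illustration):

```python
import math


def mean_vector(samples):
    """Average the band values of the operator-tagged sample pixels."""
    count = len(samples)
    bands = len(samples[0])
    return tuple(sum(s[i] for s in samples) / count for i in range(bands))


def train(tagged):
    """Build one spectral centroid per operator-supplied label."""
    return {label: mean_vector(samples) for label, samples in tagged.items()}


def classify(pixel, centroids):
    """Assign the label whose centroid is spectrally closest to the pixel."""
    return min(centroids, key=lambda label: math.dist(pixel, centroids[label]))


# Hypothetical three-band pixel values tagged by the operator.
tagged = {
    "water": [(0.05, 0.08, 0.03), (0.06, 0.09, 0.02)],
    "land": [(0.20, 0.25, 0.40), (0.22, 0.24, 0.45)],
}
centroids = train(tagged)

# Untagged pixels from elsewhere in the scene.
scene = [(0.05, 0.10, 0.04), (0.21, 0.26, 0.38)]
labels = [classify(p, centroids) for p in scene]
print(labels)
```

Production tools use far richer statistics, but the division of labor is the same: the human supplies a handful of labeled examples and the machine generalizes them across the scene.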
This is accomplished through segmentation, the parameters of which are controlled by the human operator. The user can choose finer levels of segmentation, in which an image is divided into relatively small pieces for a more detailed analysis of its contents, or coarser levels of segmentation, better suited for big-picture analyses. The “law of homogeneity” then comes into play, in which the computer, running specialized algorithms, groups pixels together into recognizable objects based on factors such as proximity and color.
“The law of homogeneity says that things close to each other are probably similar,” Blundell explained. “Green pixels that are close to each other are likely to be a body of water, or a field. The program can then use operator-controlled rules to link objects together.”
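One simple way to picture this grouping is region growing, in which neighboring pixels join the same segment when their values differ by no more than a user-set tolerance. The sketch below is an illustrative stand-in, not the algorithm used by any product named here; the `tolerance` parameter plays the role of the fine-versus-coarse control, and the tiny image is made up.

```python
from collections import deque


def segment(image, tolerance):
    """Group adjacent pixels whose values differ by at most `tolerance`.

    Returns a grid of segment labels and the number of segments found.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if labels[r0][c0]:
                continue  # already assigned to a segment
            count += 1
            labels[r0][c0] = count
            queue = deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not labels[nr][nc]
                            and abs(image[nr][nc] - image[r][c]) <= tolerance):
                        labels[nr][nc] = count
                        queue.append((nr, nc))
    return labels, count


image = [
    [10, 11, 50, 52],
    [10, 12, 51, 90],
    [11, 12, 53, 91],
]
_, fine = segment(image, tolerance=3)     # small tolerance: more, smaller segments
_, coarse = segment(image, tolerance=40)  # large tolerance: fewer, larger segments
print(fine, coarse)
```

With the small tolerance the image splits into three segments; with the large one, everything merges into a single segment.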
Objects inherently contain much more information than individual pixels, and so represent a big jump in the eventual identification and extraction of specific features, noted McIntosh. “Pixels give you nothing more than spectral characteristics,” he explained. “The information you get from objects includes how big they are, how round or elongated, the relative level of uniformity of the pixels, and other attributes. Looking at objects rather than pixels is closer to how our human brain works to identify features.”
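The kinds of attributes McIntosh lists can be computed once pixels have been grouped. A minimal sketch, assuming an object is represented as a list of (row, column, value) pixels and using bounding-box measures as a simple proxy for shape; the attribute names and sample objects are hypothetical:

```python
from statistics import pstdev


def describe_object(pixels):
    """Summarize one object given its (row, col, value) pixels."""
    rows = [p[0] for p in pixels]
    cols = [p[1] for p in pixels]
    values = [p[2] for p in pixels]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return {
        "area": len(pixels),                                   # how big
        "elongation": max(height, width) / min(height, width),  # long vs compact
        "fill": len(pixels) / (height * width),                # share of bounding box covered
        "spread": pstdev(values),                              # 0.0 means perfectly uniform
    }


# A one-pixel-wide strip and a compact, slightly textured block.
road = [(5, c, 80) for c in range(20)]
pond = [(r, c, 30 + (r + c) % 2) for r in range(4) for c in range(4)]

print(describe_object(road))
print(describe_object(pond))
```

A single pixel from the strip and one from the block would each yield only a brightness value. The strip's elongation of 20 against the block's 1 is information that exists only at the object level.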
The bottom line is that with the explosion of imagery and other geospatial data being gathered by satellite and airborne sensors, some level of automation is required if that data is to be put to use. “Sensors are collecting 1.5 million square kilometers of images per day, and the current archives for the whole community now stand at 14 petabytes of imagery data,” said Blundell. “The higher resolution data we now have available also represents a catch-22. The higher the spatial resolution, the shorter the shelf life of the imagery. Things that are important on a smaller scale are moving around and information is changing quicker. You have to refresh your archive faster and extract information more quickly in order to keep up with the requirements for decision-making.”
All of this suggests a requirement for greater reliance on automated extraction tools. ITT VIS is focusing on developing tools that allow users to define the level of detail to be examined in imagery, allowing the extraction tool to be used for wide landscape monitoring or to look for specific features.
The first step is for the user to set the level of segmentation. “There are then several steps that can be applied to refine those results,” McIntosh explained. “The user can merge or split segments depending on the desired level of detail in the image to be analyzed. Our tools allow users to define templates and rules to apply to different sets of imagery. Setting a template or a rule to find a road network, for example, can be reused in the future from a library to find the same type of feature in other imagery.”
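The idea of saving a rule once and reusing it on other imagery can be sketched as a small library of attribute ranges. The structure below is a hypothetical illustration, not the format used by the ITT VIS tools; the rule names, attribute names and thresholds are all assumptions.

```python
# Each rule maps an object attribute to an accepted (low, high) range.
RULE_LIBRARY = {
    "road": {"elongation": (5.0, float("inf")), "brightness": (60, 120)},
    "pond": {"elongation": (1.0, 2.0), "brightness": (0, 40)},
}


def matches(obj, rule):
    """True when every attribute named in the rule falls inside its range."""
    return all(low <= obj[name] <= high for name, (low, high) in rule.items())


def extract(objects, feature, library=RULE_LIBRARY):
    """Return the ids of segmented objects that satisfy a stored rule."""
    rule = library[feature]
    return [obj["id"] for obj in objects if matches(obj, rule)]


# Segmented objects from two different (made-up) images.
scene_a = [
    {"id": "a1", "elongation": 18.0, "brightness": 85},
    {"id": "a2", "elongation": 1.2, "brightness": 25},
]
scene_b = [
    {"id": "b1", "elongation": 1.1, "brightness": 90},
    {"id": "b2", "elongation": 9.5, "brightness": 70},
]

# The same stored rule is applied unchanged to both scenes.
roads_a = extract(scene_a, "road")
roads_b = extract(scene_b, "road")
ponds_a = extract(scene_a, "pond")
print(roads_a, roads_b, ponds_a)
```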
The tool can also look for multiple features at the same time, such as road and river networks, and can be adapted to widespread landscape analysis and monitoring and to perform change analyses over time. “What happens if you want to do both?” said McIntosh. “We find that it is better to analyze the data in serial rather than simultaneously. The tool allows users to make multiple passes at the data for different purposes.”
Machine Learning
Feature extraction tools have made use in recent years of the technology called machine learning, according to Blundell. “With machine learning, the computer is capable of learning from the user,” he explained. “Once it learns what the user wants, it is capable of finding it. If the user corrects information when the computer was wrong or indicates that the result was correct, the algorithm is capable of taking that information and improving its performance over time.”
The improved algorithm performance supports a higher level of automation and less of a requirement for human intervention.
Overwatch has developed a tool called Feature Analyst, which incorporates machine learning capabilities that are integrated into commercially available geographic information systems. “Building tools inside GIS systems allows users to work within a framework where they can edit information and transform it with appropriate attributes to support mapmaking,” said Blundell. “It also facilitates the process of refreshing GIS databases and developing search capabilities for targets of interest based on spectral or spatial anomalies.”
The system allows the user to click on the computer screen to indicate what features he or she wants to find. “The algorithm learns from that example and predicts where others in similar images may be,” said Blundell. “The user can correct the algorithm if necessary and then run it again.”
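The click, predict, correct and rerun cycle can be illustrated with a very small learner that remembers the user's examples and labels new candidates by their nearest remembered example. This is a sketch of the general idea only, not a description of how Feature Analyst works; the class name, feature vectors and scenario are invented.

```python
import math


class ExampleLearner:
    """Keeps user-supplied examples and predicts by nearest example."""

    def __init__(self):
        self.examples = []  # list of (feature_vector, is_target)

    def teach(self, vector, is_target):
        """Record a click or a correction from the user."""
        self.examples.append((vector, is_target))

    def predict(self, vector):
        """Return the label of the closest remembered example."""
        nearest = min(self.examples, key=lambda ex: math.dist(vector, ex[0]))
        return nearest[1]


learner = ExampleLearner()
learner.teach((0.8, 0.7), True)    # user clicks a rooftop
learner.teach((0.2, 0.6), False)   # user clicks some grass

parking_lot = (0.7, 0.4)
first_guess = learner.predict(parking_lot)   # wrongly flagged as a rooftop

learner.teach(parking_lot, False)            # user corrects the mistake

# After the correction, a similar surface is rejected,
# while a rooftop-like candidate is still accepted.
second_guess = learner.predict((0.72, 0.42))
rooftop_guess = learner.predict((0.82, 0.72))
print(first_guess, second_guess, rooftop_guess)
```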
Feature Analyst is not completely automated. The human operator creates an automated feature extraction model that can then run autonomously against other image data sets to extract other features. It is also possible to run the software autonomously on multiple data sets without user intervention.
ITT VIS also strives to incorporate the results of its feature extraction analyses in larger GIS workflows, noted McIntosh. “Most feature extractions are part of larger geospatial projects such as mapmaking,” he said. The company recently added functionality that allows its Feature Extraction Module for ENVI to be accessed through Esri’s ArcMap, part of ArcGIS. “This streamlines the process further,” said McIntosh, “and makes it more accessible to geospatial users.”
ERDAS recently released a new tool, Imagine Objective, that marries algorithms that measure the spectral characteristics of data with object space processing. “Historically these algorithms have been working primarily on one space or another,” said Lobonc. “Combining the two gives users a whole host of options.”
The software allows users to select from or create new “operators,” each of which performs relatively small tasks such as image segmentation, converting objects into vectors, or squaring off corners, each of which facilitates automated feature extraction. These operators can then be combined into reusable models for extracting specific types of features. “One issue with these types of systems is that it often takes an expert to run them,” said Lobonc. “With this tool we make these expert-designed processes available to a wide range of users.”
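Chaining small operators into a reusable model is essentially function composition. The sketch below assumes three toy operators (a brightness threshold, a corner-squaring step that fills the bounding box, and a conversion to corner coordinates) and handles a single object only; none of the names correspond to actual Imagine Objective operators.

```python
from functools import reduce


def build_model(*operators):
    """Chain operators so each one receives the previous one's output."""
    def model(data):
        return reduce(lambda value, op: op(value), operators, data)
    return model


def threshold(level):
    """Operator factory: keep the coordinates of pixels at or above `level`."""
    def op(image):
        return {(r, c)
                for r, row in enumerate(image)
                for c, value in enumerate(row)
                if value >= level}
    return op


def bounds(pixels):
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return min(rows), min(cols), max(rows), max(cols)


def square_off(pixels):
    """Replace a ragged pixel set with its filled bounding box."""
    if not pixels:
        return set()
    r0, c0, r1, c1 = bounds(pixels)
    return {(r, c) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)}


def to_vector(pixels):
    """Convert a pixel set to the four corners of its outline."""
    if not pixels:
        return []
    r0, c0, r1, c1 = bounds(pixels)
    return [(r0, c0), (r0, c1 + 1), (r1 + 1, c1 + 1), (r1 + 1, c0)]


# An expert assembles the model once; other users simply call it.
building_model = build_model(threshold(5), square_off, to_vector)

image = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 1, 0],
    [0, 0, 0, 0],
]
outline = building_model(image)
print(outline)
```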
ERDAS’ tool has been used to combine imagery and LiDAR data to model urban structures and compare those against databases to determine if any change has taken place.
For Lobonc, the real driver of recent progress in the area of feature extraction is less the core algorithms used to perform the tasks than other factors such as the resolution and accuracy of the data being worked on and the fusion of multiple algorithm types and multiple data sources. “Imagery from multispectral sources has gotten much better and of higher resolution,” he said. “The algorithms provide better results when they work with that imagery and when they are able to use multi-source data. You can get certain types of information from pixel values, elevation data from LiDAR, and feature information from databases. Advances in computing and storage capabilities of recent years have also contributed to this progress.”
Lobonc sees continued incremental progress as image quality improves and processing power increases, allowing the efficient application of algorithms to more diverse and rich data sets. “Algorithms will run faster, have a higher reliability, be able to discriminate at a finer level of detail, and be able to provide much richer attribution about the feature being extracted,” he said.
Object-Based Extraction
McIntosh expects ITT VIS to be adding features that allow users to incorporate their own segmentation algorithms into the Feature Extraction Module. He also expects advances in object-based feature extraction.
“Currently, objects give us a very attribute-rich segment,” he said. “We have added more robust elements to feature definition with very targeted rule sets. The next step will be to start defining a lot of features up front before we even have segments or objects. We’ll start by saying, ‘I've got roads or a river network to find.’ The user will enter that into the interface and the extraction tools will do their work based on more intelligent input from the user and settings that are applied based on what the user is trying to extract from the imagery.”
The company is also planning on making its tool more interactive with a preview window displaying instantaneous results as the user changes tool settings. “It could take a user hours to process blindly two gigabytes of panchromatic imagery,” said McIntosh. “That’s a lot of time to waste if the analysis doesn’t work out. Getting instant feedback will be an enormous time saver and allows the user to tweak the settings as he goes along.”
McIntosh also foresees the development of segmentation and refinement tools for specific classes of features. “Right now we have templates and rules that can be saved in libraries,” he explained. “We’ll start seeing libraries on the other end of things, so that if you need to extract roads you'll pull up the road extractor. If you need to extract buildings, you’ll pull up a ready-made tool for that task.”
Blundell sees automated feature extraction entering a golden age in the next few years. “Today we are often asked to extract features from a single image, which is like trying to grow roses in flinty soil. Satellites are providing higher resolution images and growing numbers of spectral bands. All that allows for a much richer set of information to draw from.”
Lobonc offers a different perspective. “What I don’t foresee in the immediate future is the Holy Grail, where you can turn an algorithm loose on a complex extraction task and provide a 98 percent success rate 90 percent of the time with 90 percent of the imagery,” he said. “We are not at the point where a computer can mimic the human brain in instantaneously recognizing objects in imagery such as trees, streets, curbs and buildings.
“We have made progress in automating feature extraction, but at some point we have to bring the human back in the loop,” he added. “Unless and until someone makes a big breakthrough in replicating human visual perception, the emphasis is going to be improving the performance of applications and refining the interactive human-machine process.”
But what about automatically extracting 100 percent of features with proper attribution? “That is something we’re all still chasing,” said Lobonc.
Algorithm Issues
Automatic extraction from imagery has been very successful for image registration and digital terrain model generation. Other areas such as 3-D building and house extraction and road network extraction are not so successful. The radiometric properties or spectral characteristics of features are very complex and variable.
Because of the different colors and patterns, it is very difficult for any algorithm to extract features automatically from imagery alone. Algorithms that work well with one set of images may not work at all with a different set, because radiometric properties are often very different. In the last few decades, researchers have developed many automatic feature extraction algorithms with limited success.
The automation in image registration and terrain extraction is much simpler than automation in other types of feature extraction. In terrain extraction, all that is needed is to find conjugate pixels, the pixels in two images that depict the same point on the ground, a process known as image matching. Although much simpler, image matching is the most researched algorithm in computer vision and image processing, and it is still not a completely solved problem.
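To make image matching concrete, the sketch below searches one row of a second image for the window that best matches a small window around a pixel in the first image, scoring candidates by the sum of squared differences. It is a deliberately simplified one-dimensional illustration with made-up values. Real matchers work on two-dimensional windows, account for sensor geometry and must cope with ambiguous or occluded areas, which is part of why the problem remains open.

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length windows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def find_conjugate(left_row, right_row, col, half_window=1):
    """Find the column in right_row that best matches left_row[col]."""
    template = left_row[col - half_window: col + half_window + 1]
    best_col, best_cost = None, float("inf")
    for candidate in range(half_window, len(right_row) - half_window):
        window = right_row[candidate - half_window: candidate + half_window + 1]
        cost = ssd(template, window)
        if cost < best_cost:
            best_col, best_cost = candidate, cost
    return best_col


# The same bright feature appears two columns further right in the second image.
left_row = [10, 10, 40, 90, 40, 10, 10, 10]
right_row = [10, 10, 10, 10, 40, 90, 40, 10]

match = find_conjugate(left_row, right_row, col=3)
shift = match - 3  # offset between the conjugate pixels
print(match, shift)
```

The offset between conjugate pixels is the quantity from which elevation is later derived, given the imaging geometry.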
Because of the success with automatic terrain extraction, we can take advantage of the invariant property in a digital surface model to automatically extract other types of 3-D features such as buildings, houses and trees. All of the 3-D features have one common property—they are above the ground, and this property does not change from image to image, from sensor to sensor, from time to time.
Algorithms that deal with only one invariant property (above the ground) are much simpler and therefore much more likely to be successful. At BAE Systems, engineers and scientists are taking this approach for automatic 3-D feature extraction. The company’s new automatic feature extraction functionality takes a digital surface model as input and automatically extracts 3-D buildings, houses and trees. The digital surface model may be generated by any image matching software such as BAE Systems’ Next-Generation Automatic Terrain Extraction system, or from LiDAR. Although originally designed for modeling, simulation and visualization, automated feature extraction has potential for use in robotic and UAV applications. The same algorithms could be used to extract and identify other types of 3-D objects such as vehicles, airplanes and people.
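The "above the ground" property lends itself to a very short illustration. The sketch below assumes a bare-earth ground model is already available alongside the digital surface model and simply flags cells that rise more than a chosen height above it. This is not BAE Systems' algorithm, which takes only a surface model as input; the grids and the two-meter threshold are invented for the example.

```python
def above_ground(surface, ground, min_height=2.0):
    """Flag cells whose surface elevation exceeds the ground by min_height."""
    return [
        [(s - g) >= min_height for s, g in zip(surface_row, ground_row)]
        for surface_row, ground_row in zip(surface, ground)
    ]


# Elevations in meters on a 3 x 3 grid; a structure occupies three cells.
surface = [
    [100.0, 100.2, 100.1],
    [100.1, 108.0, 107.9],
    [100.0, 108.1, 100.3],
]
ground = [[100.0] * 3 for _ in range(3)]

mask = above_ground(surface, ground)
raised_cells = sum(cell for row in mask for cell in row)
print(mask, raised_cells)
```

Because the test depends only on height, the same mask would result whatever the color, material or lighting of the structure, which is the invariance the approach relies on.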
“I expect accelerated and significant advances in automatic feature extraction and object recognition in the next few years from other professions such as robotics and computer vision,” said Dr. Bingcai Zhang, engineering fellow, BAE Systems. “Companies like Facebook, Google and Microsoft are developing algorithms to recognize features, or to navigate autonomous vehicles. The demand for UGV and UAV is high, and a real-time 3-D automatic mapping component may be needed for navigational purposes. GPU computing is going to be an enabler for this accelerated advance. A graphics card can have hundreds of processing cores that are ideal for massive parallel image processing and automatic feature extraction.”







