Computing Clouds Cast Geospatial Vision


Swelling demand for imagery prompts plan for a
shared-services approach to data and applications.


Cloud computing, the increasingly popular IT concept that uses a cloud to symbolize the Internet as the data and application services provider, shielding users from the underlying complexity, is extending its reach into the world of geospatial intelligence. The National Geospatial-Intelligence Agency envisions establishing a GEOINT distributed computing cloud, contained within a larger, high-performance cloud, to achieve many of the architectural objectives in the Department of Defense and intelligence community missions.

NGA’s interest in virtualization technology for storage, networks and server processing, and its alignment with service-oriented architecture (SOA) objectives advocated by DoD and the director of national intelligence, are paving the way toward consolidated cloud computing strategies for geospatial information.

This comes at a time when geospatial image providers, such as GeoEye and DigitalGlobe, have experienced swelling demand. “The key trends are speed and volume. The volumes of data will continue to rise, along with the demand for increased response time and larger, higher resolution areas,” said Brian O’Toole, GeoEye chief technology officer.

DoD and intelligence clients not only require more images faster; they also want better, more useful images. “The military wants larger areas that can be accommodated by one image, so we need to put multiple images together without any visible break lines between the images, to produce orthomosaics and orthoimages. The value-add requires additional processing of the images into the equivalent of maps,” said Ray Helmering, GeoEye vice president of engineering.

Such volume, speed and variety of data call for high-performance computing and easier access to information. “We’ve seen a growing trend by the military and intelligence community to employ Internet-based solutions for Web hosting of geospatial data,” said Stephen Wood, vice president of business operations and U.S. defense sales, DigitalGlobe. “The sharing of geospatial data and the setting up of easily accessible geospatial cloud communities is where we are headed.”

Cloud computing essentially combines virtualization and SOA in a networked, high-performance, highly scalable and easily resizable computing paradigm, adding fault tolerance for reliability and the other “ilities,” as NGA refers to them. Information is stored on network servers, but cached temporarily on demand in client environments from desktops to wireless handhelds. Just as an intranet is more tightly controlled than the public Internet, a private cloud provides better security than a public cloud.

Because many terms get bandied about when the subject of cloud computing is raised, experts like to clarify. “As some technology experts believe, virtualization gives you the illusion of your own computer, while cloud computing gives you the illusion of your own data center,” said Joe Kraska, manager, federal data center research and prototype operations, BAE Systems. “And grid computing is more static, while cloud is elastic and grows.”

However, elasticity and growth raise the need for sophisticated security, an area BAE is now addressing. “We are looking at ways to deploy private clouds that would satisfy our customers’ security requirements,” said Kraska.

VARYING IMPLEMENTATIONS

Like the industry’s varying emphases on the definition, many different cloud implementations will likely emerge, given the recognized IT benefits. “Cloud computing virtualization provides an IT avenue to achieve many of the architectural ‘ilities’ that our customer missions require: reliability, availability, scalability, agility, interoperability and so on. SOA objectives are leading our communities of interest to drive toward interoperable, Web-enabled services offerings, potentially allocated and bundled to reflect centers of excellence,” said Christopher Cuppan, National System for Geospatial-Intelligence (NSG) chief architect, NGA Office of the Chief Information Officer.

Many of NGA’s worldwide data providers are, in themselves, discrete centers of excellence that could reside on a cloud platform, collectively forming clouds within a larger cloud. “Our customers require seamless access to this worldwide heterogeneous GEOINT data domain for posting, discovery, retrieval, synthesis/integration, exploitation and value-added update posting. They also require minimal latency in the content currency of this data domain to include value-added post-processing. Virtualization, thus cloud architectures, provides an excellent architectural pattern—albeit more of a metaphor—to satisfy these customer needs,” Cuppan said.

Cuppan’s caveat, that cloud computing is still more of a metaphor than a finite construct, rests primarily on the lack of security and standards for this fast-growing but nascent technology. “Our concerns regarding security are sizable. As the myriad missions of the NSG often operate within multiple high security domains, we cannot have our data literally ‘disappear into the clouds’ when it is processed or exploited,” he said.

This gives rise yet again to the time-honored IT quandary about the balance of security and flexibility. With cloud computing, the quagmire looms ever larger, given the lofty but still uncertain promise of this new framework to deliver concurrently not just flexibility and reliability, but also accessibility, security and the highest possible scalability. “Securing our data/knowledge may in fact impede accruing the processing benefits of the cloud. Conversely, the absence of processing control responsibilities levied on users—a desired attribute of cloud architecture—may undermine our requisite positive security controls,” Cuppan said.

As a result, cloud architectures will levy heavy responsibilities on their supporting infrastructures, which will define and probably innovate the way secure, standards-based cloud computing is structured for the defense and intelligence community. “The requirement to implement comprehensive data, attribute-level security and digital-rights-management capabilities appears certain. Thus, not just any infrastructure may be eligible for cloud processing membership. This may certainly narrow down viable extant infrastructure candidates,” he said.

A flurry of commercial market activity revolving around cloud computing this year raised high expectations, while the realities of what it will take to deliver on them remain to be seen. “Cloud computing as a technology framework has evolved so quickly from a perfect storm—a lack of power, space and funding that focused people to share and optimize. It exploded to prominence before anybody had enough time to codify the standards and security for it, but industry is working quickly to figure out the solutions to these challenges,” said Robert Ames, director, deputy chief technology officer, IBM Federal.

Prominent commercial cloud introductions in 2008 included Microsoft’s Windows Azure, which competes with Amazon.com’s Elastic Compute Cloud by offering the flexible combination of software and services that characterizes cloud computing. Google introduced the Google App Engine for dynamic scalable Web serving, storage and automated load balancing. Amazon, meanwhile, announced Amazon CloudFront, a self-serve, pay-as-you-go Web service for content delivery with low latency and high data-transfer speeds.

STANDARDS GROUPS

While industry developments outpace standards and security progress, established standards groups have delivered the technology on which NGA operates. “The NSG is based heavily on standards and services for geospatial information discovery, access and general transactions as published by the Open Geospatial Consortium (OGC),” said Cuppan.

Transactional services that are part of the OGC’s Spatial Data Infrastructure (SDI) and Sensor Web Enablement (SWE) standards suites assume highly distributed Web-based data and metadata, databases, sensors, collectors and processing capabilities across the NSG. As a result, NGA clients can discover and access information, such as sensor-to-sensor syndicated alerts, without any awareness of the underlying infrastructure delivering this capability, one characteristic of cloud computing.

Meanwhile, the Open Grid Forum (OGF) has been developing international standards to hasten the adoption of grid computing and other distributed technologies such as virtualization, SOA and cloud technologies. IBM, Intel and Microsoft are members of the OGF.

The OGC has been working with the OGF to develop standards for cloud computing in geospatial technology. The two groups developed Web Processing Service (WPS), a standard method of workflow to process raw data into more valuable information for decision-support systems.

“WPS will identify a feature in an image, for example, to make the image smarter and more relevant to the task. It will grow image processing in the geospatial market,” said George Percivall, OGC chief architect.

The OGC adopted WPS as a standard last year. “We’re now close to finishing the work on WPS,” said Sam Bacharach, OGC executive director.

Industry acceptance of WPS is widely expected. “WPS would be an element of any geospatial compute processing cloud,” said Kraska. “But with that standard, you still don’t have a cloud. You still need self-healing and automated management to have a cloud.”

There is also a difference between standards for interoperability and interaction. “The OGC has provided good standards for application-to-application interoperability. But standards for application-to-application interaction are the necessary ground-up building of the SOA environment for cloud computing in geospatial,” said Bob Lozano, chief strategist and founder of Appistry.

As with most geospatial information and image processing, WPS requires significant compute power, which is where cloud computing could support the process. “Data compositing as performed by WPS may be computationally intensive and need to be spread across commodity hardware and software grids to enable its near-real-time completion and delivery,” said Cuppan. “Cloud architecture concepts may allow us to address processing transients and the fluid plug-and-play objectives of coalition operations.”

VIRTUALIZATION MIDDLEWARE

Meanwhile, industry innovations and implementations plow ahead, not waiting for the standards. One example is the way in which GeoEye and Appistry are working together to cloud-enable key applications using Appistry’s Enterprise Application Fabric (EAF) to create a fault-tolerant cloud. “We’re using the Appistry EAF fabric to move to a distributed processing system,” Helmering said.

GeoEye supplies the algorithms and key software, while Appistry provides the EAF software. “We built frameworks on top of EAF that are customized for geospatial use. GeoEye is using it in production now for sensor data coming from a variety of satellites, taking raw data and creating refined product,” said Lozano.

“We’ve gone from prototyping to building real software,” he said. “We’re cloud-enabling GeoEye’s product that massages incoming data so they can run their applications on private clouds to achieve higher scales more quickly at lower infrastructure and operational costs by using off-the-shelf hardware.”

EAF is highly scalable virtualization middleware that aggregates resources, pulling together different views of hardware so that they appear as one. “A key to cloud-enabling applications is how to aggregate the underlying resources, whether physical or virtual,” Lozano noted. EAF achieves fault tolerance by automatically reassigning work in the event of a processing problem.
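The idea of fault tolerance through automatic reassignment can be illustrated with a minimal Python sketch. It does not use or reproduce Appistry's EAF interfaces; the function `run_with_reassignment`, the worker callables and the tile names are all hypothetical, and a failed node is simulated by a function that raises an error.

```python
def run_with_reassignment(tasks, workers):
    """Run every task, moving it to the next worker whenever one fails.

    Hypothetical illustration only: workers are plain callables that
    either return a result or raise RuntimeError to simulate a fault.
    """
    results = {}
    for i, task in enumerate(tasks):
        last_error = None
        # Start at a different worker for each task to spread the load,
        # then walk through the remaining workers if a fault occurs.
        for offset in range(len(workers)):
            worker = workers[(i + offset) % len(workers)]
            try:
                results[task] = worker(task)
                break
            except RuntimeError as err:
                last_error = err  # remember the fault and reassign
        else:
            # Every worker failed for this task, so give up loudly.
            raise RuntimeError(f"all workers failed for {task}") from last_error
    return results


def healthy_node(task):
    return f"processed {task}"


def failed_node(task):
    raise RuntimeError("node unavailable")


# Tile names are made up for the example.
results = run_with_reassignment(
    ["tile-1", "tile-2", "tile-3"], [failed_node, healthy_node]
)
print(results)
```

The caller never learns which node did the work, only that each task completed, which is the property the middleware is described as providing.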

Lozano said EAF can cloud-enable applications easily in some cases by simply adding application metadata to tell EAF what the target application is doing. “When the situation is more complex, it includes [application programming interface] calls that are independent of the cloud,” he said. Lozano believes a missing ingredient that is essential to the success of cloud computing is the ability to make applications feel native to the world of the cloud.

Using EAF is one way to cloud-enable applications. Another is Hadoop, an open source software platform for writing applications that process massive amounts of data on large clusters of commodity hardware. Hadoop pairs an implementation of MapReduce, which divides applications into many small blocks of work, with the Hadoop Distributed File System (HDFS), which creates multiple replicas of each data block and places them on different nodes in a cluster to achieve reliability.
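A rough sketch in plain Python may make the two ideas concrete: splitting a job into map and reduce steps, and keeping several copies of each data block on different nodes. This is not Hadoop's API. The function names, the sample image records and the round-robin placement rule are assumptions made for illustration, and real HDFS placement is more sophisticated.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and collect (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle(pairs):
    """Group values by key so each key can be reduced independently."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's values into one result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def place_replicas(block_ids, nodes, replication=3):
    """Assign each block to `replication` distinct nodes (simple round robin)."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: [nodes[(i + r) % len(nodes)] for r in range(replication)]
        for i, block in enumerate(block_ids)
    }


# Hypothetical input: (region, image size in megabytes).
records = [("region-a", 120), ("region-b", 80), ("region-a", 200)]


def mapper(record):
    region, size_mb = record
    return [(region, size_mb)]


def reducer(region, sizes):
    return sum(sizes)


totals = reduce_phase(shuffle(map_phase(records, mapper)), reducer)
placement = place_replicas(
    ["blk-0", "blk-1"], ["node-1", "node-2", "node-3", "node-4"]
)
print(totals)
print(placement)
```

Because each key is reduced on its own, the map and reduce steps could run on separate machines, and losing one node leaves other copies of every block available.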

NGA has studied Hadoop as a blueprint by which to take an OGC geospatial metadata standard called Catalogue Service for the Web (CSW) and distribute it broadly. “CSW backend implementation details that enable clients to discover data and related processing services might be implemented using HDFS, which are central cloud processing constructs for Google and Yahoo. The CSW indices may be distributed across a broad computational lattice,” said Cuppan.
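One simple way to picture catalog indices spread across many machines is hash-based sharding, sketched below in Python. This is an assumption chosen for illustration, not a description of NGA's design or of any CSW implementation. The class `ShardedCatalog`, the record identifiers and the metadata fields are hypothetical, and each shard is an in-memory dictionary standing in for a separate node.

```python
import hashlib


class ShardedCatalog:
    """Toy metadata catalog whose index is split across several shards."""

    def __init__(self, shard_count):
        # Each dict stands in for the index held by one node.
        self.shards = [dict() for _ in range(shard_count)]

    def _shard_for(self, record_id):
        # A stable hash keeps a given record on the same shard every run.
        digest = hashlib.sha256(record_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def add(self, record_id, metadata):
        self.shards[self._shard_for(record_id)][record_id] = metadata

    def get(self, record_id):
        # A lookup by identifier touches exactly one shard.
        return self.shards[self._shard_for(record_id)].get(record_id)

    def search(self, predicate):
        # A discovery query fans out to every shard and merges the hits.
        hits = []
        for shard in self.shards:
            hits.extend(rid for rid, meta in shard.items() if predicate(meta))
        return sorted(hits)


catalog = ShardedCatalog(shard_count=4)
catalog.add("img-001", {"sensor": "optical", "region": "a"})
catalog.add("img-002", {"sensor": "radar", "region": "a"})
catalog.add("img-003", {"sensor": "optical", "region": "b"})

optical_ids = catalog.search(lambda meta: meta["sensor"] == "optical")
print(optical_ids)
```

The client sees one catalog, while lookups and searches are served by whichever shards hold the relevant entries.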

Both Amazon and Google have technology similar to Hadoop to cloud-enable applications, but it is kept proprietary as a way to encourage customers to bring their data to Amazon’s or Google’s cloud computing platform. Google’s Bigtable is a compressed, high-performance distributed data store built partly on the Google File System, which Google App Engine customers can access.

However fast and petabyte-scale Bigtable is, developers who build on it may need to rewrite their applications to run them anywhere else. “If you write your application to make use of Bigtable, it’s proprietary and you can’t redeploy the application extensively without rewriting it,” Lozano said, adding that Appistry is working on a standard approach to minimize the source code changes needed to cloud-enable an application.

FILE MANAGEMENT

File management and security are essential issues with large global geospatial files. To ease existing file management issues resulting from complex, multiple network attached storage (NAS) or storage area network (SAN) systems that might not scale well, it is usually necessary to integrate storage and file systems.

Establishing a common file system that is shared by Windows, Linux and Unix users can achieve seamless interoperability that carves a path to cloud computing. “In Windows, you would see a shared drive, like a D drive, that looks the same to one as another regardless of the underlying platform,” said IBM’s Ames.

In one example, IBM’s existing, tested technology, the Scale Out File System (SOFS), accomplishes this objective. The SOFS is an NAS/grid system that manages and broadly scales out NAS environments, in part by optimizing storage. SOFS utilizes the company’s General Parallel File System, a highly secure, long-tested and established file system for high-performance computing.

IBM middleware, the WebSphere Federation Server, contains an embedded RDBMS and integrates remote, diverse data and content sources, making them appear as if they were the same database. The technology contains security management and user authentication. The SAIC Common Criteria Testing Lab successfully tested the security of WebSphere Federation Server running on AIX and Red Hat Linux.

“Using WebSphere Federation Server, a user writes a single query, [the server] then optimizes the query, sends the request across the heterogeneous information stores, which then joins the necessary information and sends it back to the user,” explained Ames.

There is no question that the desire to achieve improved exploitation of available intelligence is leading to consolidated cloud strategies for geospatial information. “We have all these intelligence capabilities across the globe. It makes sense to integrate by cross-tipping or cross information providing so analysts could discover other capabilities. One example is combining imagery intelligence with signals intelligence. If you have a fixed high resolution image from a given time and a moving, lower resolution image, then it’s stronger to have the fixed image plus the moving image to acquire a much richer intelligence view,” said Ames. “By putting the information silos into an integrated information cloud, you can make the information much more usable, as it is easier to get to.”

While NGA takes cloud computing seriously, the agency doesn’t have any unrealistic expectations about how easy it will be to deliver. “Placing GEOINT in an agile, high-performance computational cloud is certainly part of the strategy. However, that objective remains a vision and not yet committed to through the establishment of tangible acquisition programs,” said Cuppan.
