Grid computing and XTP are newer integration technologies that are likely to become increasingly popular over the next few years.
Grid computing is the term used to describe methods that combine the computing power of a number of computers in a network in a way that enables the (parallel) solution of compute-intensive problems (distributed computing), in addition to the simple exchange of data. Every computer in the grid is equal to all the others. Grids can exceed the capacity and computing power of today's supercomputers at considerably lower cost, and are also highly scalable. The computing power of a grid can be increased by adding computers to the grid network, or by combining grids to create meta grids.
Definition of a grid
A grid is an infrastructure enabling the integrated, collaborative use of resources which are owned and managed by different organizations (Foster, Kesselman 1999).
The following diagram illustrates the basic model of grid computing, with the network of computers forming the grid in the middle:
The main tasks of grids are:
Grids have the following features which allow more sophisticated Service Level Agreements (SLA) to be set up:
Grids can be broken down into data grids, in-memory data grids, domain entity grids, and domain object grids on the basis of their primary functionality.
A data grid is a system made up of several distributed servers which work together as a unit to access shared information and run shared, distributed operations on the data.
In-memory data grids are a variant of data grids in which the shared information is stored locally in memory in a distributed (often transactional) cache. A distributed cache is a collection of data or, more accurately, a collection of objects that is distributed (or partitioned) across any number of cluster nodes, in such a way that exactly one node in the cluster is responsible for each piece of data in the cache, and the responsibility is distributed among the cluster nodes.
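The ownership rule described above — exactly one cluster node is responsible for each piece of data — can be sketched in a few lines. The class and node names below are purely illustrative and do not belong to any specific grid product; a deterministic hash is used so that every node agrees on who owns a key.

```python
import hashlib

class PartitionedCache:
    """Toy model of a partitioned distributed cache."""

    def __init__(self, nodes):
        self.nodes = list(nodes)                # node names
        self.stores = {n: {} for n in nodes}    # one local store per node

    def _owner(self, key):
        # Deterministic hash: every node computes the same owner for a key.
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return self.nodes[h % len(self.nodes)]

    def put(self, key, value):
        # The entry lands on exactly one node's local store.
        self.stores[self._owner(key)][key] = value

    def get(self, key):
        return self.stores[self._owner(key)].get(key)

cache = PartitionedCache(["node-a", "node-b", "node-c"])
cache.put("order:42", {"total": 99.5})
assert cache.get("order:42") == {"total": 99.5}
```

In a real grid the local stores live in the memory of separate processes and the hash-based routing is performed by the grid infrastructure, but the responsibility model is the same.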
Concurrent data accesses are handled cluster-wide by the grid infrastructure if specific transactional behavior is required. The main advantage is the high level of performance made possible by low latency memory access. Today's 64-bit architectures and low memory prices allow large volumes of data to be stored in memory, where they are available for low latency access.
However, if the memory requirements exceed the memory available, "overflow" strategies can be used to store data on a hard disk (for example, in the local filesystem or in local databases). This results in a drop in performance caused by higher latency. More recent developments, such as solid state disks, will in the future allow a reasonable and cost-effective compromise in this area, and will be an ideal solution in scenarios of this kind.
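A minimal sketch of such an overflow strategy, assuming an LRU-style in-memory tier that spills evicted entries to local files (all names and the eviction policy here are illustrative, not taken from any particular product):

```python
import os, pickle, tempfile
from collections import OrderedDict

class OverflowCache:
    """In-memory cache that spills least recently used entries to disk."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.memory = OrderedDict()             # hot entries, LRU order
        self.disk_dir = tempfile.mkdtemp()      # stands in for the overflow tier

    def _disk_path(self, key):
        return os.path.join(self.disk_dir, f"{key}.bin")

    def put(self, key, value):
        self.memory[key] = value
        self.memory.move_to_end(key)
        # Evict the least recently used entries once memory is full.
        while len(self.memory) > self.capacity:
            old_key, old_val = self.memory.popitem(last=False)
            with open(self._disk_path(old_key), "wb") as f:
                pickle.dump(old_val, f)         # higher-latency overflow tier

    def get(self, key):
        if key in self.memory:
            self.memory.move_to_end(key)
            return self.memory[key]
        path = self._disk_path(key)
        if os.path.exists(path):                # overflow hit: slower read
            with open(path, "rb") as f:
                return pickle.load(f)
        return None
```

Reads served from the overflow tier pay the latency penalty described above; a real implementation would also promote such entries back into memory.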
Data loss caused by a server failing and, therefore, its area of the memory being lost, can be avoided by the redundant distribution of the information. Depending on the product, different distribution topologies and strategies can be selected or enhanced.
In the simplest case, the information is distributed evenly across all the available servers.
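The redundant placement idea can be illustrated with a simple primary/backup scheme: each entry is owned by one node and backed up on a different node, so a single node failure loses no data. This is a sketch of one possible topology, not the strategy of any specific product.

```python
import zlib

class RedundantPlacement:
    """Assigns each key a primary node and a backup on a different node."""

    def __init__(self, nodes):
        assert len(nodes) >= 2, "redundancy needs at least two nodes"
        self.nodes = list(nodes)

    def owners(self, key):
        # Stable hash so all nodes agree on the placement.
        i = zlib.crc32(key.encode()) % len(self.nodes)
        primary = self.nodes[i]
        backup = self.nodes[(i + 1) % len(self.nodes)]   # always a different node
        return primary, backup
```

If the primary fails, the grid infrastructure promotes the backup and creates a new backup elsewhere; products differ in how many backups they keep and how they place them (for example, rack-aware placement).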
An in-memory data grid helps the application to achieve shorter response times by storing the user data in memory in formats which are directly usable by the application. This ensures low latency storage access and avoids complex, time-consuming transformations and aggregations when the consumer accesses the data. Because the data in the grid is replicated, buffering can be used to accommodate database failures, and the availability of the system is improved. If a cluster node in the data grid fails, the data is still available on at least one other node, which also increases availability. Data will only be lost in the case of a total failure, and this can be counteracted by regular buffering to persistent storage (hard disk, solid state disk, and so on).
Domain entity grids distribute the domain data of the system (the applications) across several servers. As these are often coarse granular modules with a hierarchical structure, their data may have to be extracted from several different data sources before being made available on the grid across the entire cluster. The data grid takes on the role of an aggregator/assembler which gives the consumers cluster-wide, high-performance access to the aggregated entities. The performance can be further improved by the grid by initializing the domain data before it is actually used (pre-population).
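The aggregator/assembler role with pre-population can be sketched as follows. The entity shape, the source systems, and the function names are invented for illustration; the point is that each entity is assembled once from several data sources before consumers first access it.

```python
def load_customer(customer_id):
    """Assemble one domain entity from several backend sources (stand-ins)."""
    core = {"id": customer_id, "name": "Alice"}     # e.g. from a CRM database
    orders = [{"order_id": 1, "total": 99.5}]       # e.g. from an order system
    return {**core, "orders": orders}               # the aggregated entity

grid = {}   # stands in for the cluster-wide entity store

def prepopulate(customer_ids):
    # Pre-population: aggregate each entity ahead of first use, so consumers
    # get cluster-wide, low latency access to the finished aggregate.
    for cid in customer_ids:
        grid[f"customer:{cid}"] = load_customer(cid)

prepopulate([7])
assert grid["customer:7"]["orders"][0]["total"] == 99.5
```

Without pre-population the same assembly work would happen lazily on the first access, which shifts the aggregation cost onto the first consumer.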
A domain object grid distributes the runtime components of the system (the applications) and their status (process data) across a number of servers. This may be necessary for reasons of fail-safety, and also because of the parallel execution of program logic. By adding additional servers, applications can be scaled horizontally. The necessary information (data) for the parallelized functions can be taken from shared data storage, (although this central access can become a bottleneck, which reduces the scalability of the system as a whole) or directly from the same grid or a different grid. It is important to take into account the possibilities of individual products or, for example, to combine several products (data grid and computing grid).
Different distribution topologies and strategies are available, such as replicated caches and partitioned caches (Misek, Purdy 2006).
Data and objects are distributed evenly across all the nodes in the cluster. However, this means that the available memory of the smallest server acts as the limiting factor. This node determines how large the available data volume can be.
Advantages:
Disadvantages:
The disadvantages must be compensated for as far as possible by the grid infrastructure, and circumvented by taking appropriate measures. This should be made transparent to the programmer by using an API which is as simple as possible. The implementation could take the form of local read-only accesses without notification to the other cluster nodes. Operations with supervised concurrent access require communication with at least one other node. All the cluster nodes must be notified about update operations. An implementation of this kind results in very high performance and scalability, together with transparent failover and failback.
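The read/update asymmetry of a replicated cache can be made concrete with a small sketch (the class is illustrative; real products propagate updates over the network rather than over a shared list):

```python
class ReplicatedCache:
    """Toy model of a fully replicated cache across N cluster nodes."""

    def __init__(self, node_count):
        # One full copy of the data per node.
        self.replicas = [dict() for _ in range(node_count)]

    def put(self, key, value):
        # An update must reach every node -- this per-node communication
        # is what makes write-heavy replicated caches scale poorly.
        for replica in self.replicas:
            replica[key] = value

    def get(self, key, node_id):
        # A read is served purely from the local replica: no network hop.
        return self.replicas[node_id].get(key)

cache = ReplicatedCache(3)
cache.put("rate:EUR-USD", 1.08)
assert cache.get("rate:EUR-USD", 0) == cache.get("rate:EUR-USD", 2)
```

Adding nodes makes reads cheaper to distribute but every update more expensive, which is exactly the scaling limit discussed next.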
However, it is important to bear in mind that replicated caches with a large number of data updates do not scale linearly as the cluster grows (adding nodes), because each update involves additional communication with every node.
Partitioned caches resolve the disadvantages of replicated caches, relating to memory and communications.
If this distribution strategy is used, several factors must be taken into account:
Agents are autonomous programs that are triggered by an application, and are executed on the information stored in the grid under the control of the grid infrastructure. Depending on the product, specific classes of programming APIs may need to be extended or implemented for this purpose. Alternatively, declarative options allow agent functionality of this kind to be established (for example, using aspect-oriented methods or pre-compilation steps). Predefined agents are often provided with particular products.
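The agent pattern can be sketched as follows: the application submits a small function, and the grid runs it on each node against the locally stored entries, returning partial results for aggregation. The API shape below is an assumption for illustration; concrete products expose this through their own agent or invocation interfaces.

```python
class GridNode:
    """One cluster node holding a local partition of the grid data."""

    def __init__(self, data):
        self.data = data

    def run_agent(self, agent):
        # The agent executes next to the data, not on the client,
        # so only the (small) result crosses the network.
        return agent(self.data)

nodes = [GridNode({"a": 2, "b": 3}), GridNode({"c": 5})]

# Submit a summing agent to every node and aggregate the partial results.
total = sum(node.run_agent(lambda local: sum(local.values())) for node in nodes)
assert total == 10
```

Moving the computation to the data in this way avoids shipping each entry to the client, which is the main reason for executing agents under the control of the grid infrastructure.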
Let's take a brief look at these execution patterns:
Grid technology can be used in a variety of different ways in architectures:
As a result of the need for complex processing of large and very large volumes of data (for example, in the field of XML, importing large files with format transformations, and so on), new distributed storage architectures with parallel application access functions have been developed in recent years.
A range of cross-platform products and solutions is available for what is known as "extreme transaction processing" or XTP. The term was coined by the Gartner Group, and describes a style of architecture which aims to allow for secure, highly scalable, and high-performance transactions across distributed environments on commodity hardware and software.
Solutions of this kind are likely to play an increasingly important role in service-oriented and event-driven architectures in the future. Interoperability is a driving force behind XTP.
Distributed caching mechanisms and grid technologies with simple access APIs form the basis for easy, successful implementation (in contrast to the complex products widely used in scientific environments in the past). Although distributed cache products already play a major role in "high-end transaction processing" (an expression coined by Forrester Research), their position in the emerging Information-as-a-Service (IaaS) market is expected to become more prominent.
New strategies for business priority have been introduced by financial service providers in recent years. Banks are attempting to go beyond the limits of their existing hardware resources and develop increasingly high-performance applications, without their hardware and energy costs increasing exponentially.
The growth of XTP in areas such as fraud detection, risk computation, and stock trade resolution is pushing existing systems to their performance limits. New systems that implement this challenging functionality require new architecture paradigms.
It is clear that SOA, coupled with EDA and XTP, represents the future for financial service infrastructures as a means of achieving the goal of running complex computations with very large volumes of data, under real-time conditions. XTP belongs to a special class of applications (extreme transaction processing platforms) that need to process, aggregate, and correlate large volumes of data while providing high performance and high throughput. Typically, these processes produce large numbers of individual events that must be processed in the form of highly volatile data. XTP-style applications ensure that transactions and computations take place in the application's memory, and do not rely on complex remote accesses to backend services, in order to avoid communication latency (low latency computation). This allows for extremely fast response rates while still maintaining the transactional integrity of the data.
The SOA grid (next generation, grid-enabled SOA) is a conceptual variant of the XTPP (Extreme Transaction Processing Platform). It provides state-aware, continuous availability for service infrastructures, application data, and process logic. It is based on an architecture that combines horizontally scalable, database-independent, middle-tier data caching with intelligent parallelization, and brings together process logic and cache data for low latency (data and process affinity). This enables the implementation of newer, simpler, and more efficient models for highly scalable, service-oriented applications that can take full advantage of the possibilities of event-driven architectures.
XTP and CEP are comparable, in that they both consume and correlate large amounts of event data to produce meaningful results.
Often, however, the amount of event data that needs to be captured and processed far exceeds the capacity of conventional storage mechanisms ("there just isn't a disk that can spin fast enough"). In these cases, the data can be stored in a grid. CEP engines can be distributed across this data and can access it in parallel. Analyses can be carried out, and business event patterns can be identified and analyzed in real-time. These patterns can then be processed further and evaluated using Business Activity Monitoring (BAM).
Solid State Disk (SSD) technology is developing at high speed. Data capacities are increasing rapidly and, compared with conventional drives, the I/O rates are phenomenal. Until now, the price/performance ratio per gigabyte of storage has been the major obstacle to widespread use: per gigabyte, SSDs currently cost around 12 times as much as a normal server disk. The major benefit for data centers is the very low energy consumption, which is significantly less than that of conventional disks.
Because of their low energy requirements, high performance, low latency, and the expectation of falling costs, SSDs are an attractive solution in blades or dense racks. One interesting question concerns the influence which SSDs may have on data grid technology.
Disk-based XTP systems can benefit from the introduction of an SSD drive. However, SSDs currently have a much lower storage capacity (128 GB versus 1 TB) than conventional disks. Nevertheless, this is more than the capacity of standard main memory, and SSDs are also less costly per gigabyte than memory. The capacity of SSDs is lower than that of conventional disks by a factor of 10, and higher than the capacity of memory by a factor of 8.
SSDs bridge the gap between memory-based and disk-based XTP architectures. SSD-based architectures are slightly slower than memory-based systems, but significantly faster than the fastest disk-based systems. The obvious solution is, therefore, to provide a hierarchical storage architecture in XTP systems, where the most volatile data is stored in memory, data accessed less often is stored on SSDs, and conventional disk-based storage is used for long-term persistent data. It also seems reasonable to store memory overflows from memory-based caching on SSDs.
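The hierarchical storage idea can be sketched with three tiers probed in order of speed. The tiers here are plain dictionaries standing in for the real media, and the "volatility" labels are invented for illustration:

```python
class TieredStore:
    """Toy model of hierarchical XTP storage: memory -> SSD -> disk."""

    def __init__(self):
        self.memory = {}    # most volatile, hottest data
        self.ssd = {}       # data accessed less often
        self.disk = {}      # long-term persistent data

    def put(self, key, value, volatility):
        # Place the entry according to how hot the data is.
        tier = {"hot": self.memory, "warm": self.ssd, "cold": self.disk}[volatility]
        tier[key] = value

    def get(self, key):
        # Probe the fastest tier first, falling back to slower media.
        for tier in (self.memory, self.ssd, self.disk):
            if key in tier:
                return tier[key]
        return None
```

A real system would additionally demote entries between tiers (for example, spilling memory overflows to the SSD tier, as suggested above) rather than fixing the placement at write time.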