8

Self-Management – The Defining Attribute of Next-Gen Architectures

By Mohana Krishna BG and Sangeetha S

Deploying, configuring and adapting highly distributed, heterogeneous systems made up of a large number of complex elements is a rapidly growing challenge. Such systems must remain resilient to changes, threats and failures, both internal and environmental, in order to sustain guaranteed levels of service delivery. This complexity crisis has the real potential of turning into a nightmare, given the ever-increasing number of mission-critical applications these systems serve and the mobility and ubiquity requirements of the emerging genre of systems that live and operate in uncertain, or even hostile, environments. It becomes prohibitively expensive, time-consuming and error-prone to manually monitor all possible run-time interactions among the components and the resulting conditions, to assess their impact, and to dynamically manage any undesirable outcomes.

A way forward that offers hope and promise on the one hand, and draws skepticism (on account of the current state of technology) on the other, is “self-managed” systems. These are systems capable of self-awareness, self-configuration, self-adaptation, self-healing, and so on, which allows them to remain flexible and robust in the face of change while acting on high-level guidance from a human administrator. These properties often go under the “self-” moniker, and such systems are also widely referred to as autonomic systems.

Self-management - Characteristics and Approaches

A key manifestation of autonomic behavior is the ability of the system to translate high-level, broadly-scoped policies and goals, covering functional as well as non-functional behavior, into concrete run-time actions. Policies represent required behavior or constraints that the system has to satisfy, through dynamic adaptation of its operational parameters, so that the specified goals are met. When it is unable to do so in certain exceptional cases, the system should report the failure and should, where possible, degrade gracefully.

The essential characteristics of self-managed systems can be broadly classified under the following four headings:

Self-configuring: The ability of the system to adapt itself, with minimal need for intervention, to changes within the system or in its environment, based on high-level policies and goals such as a business policy or goal.
Self-optimizing: The ability of the system to tune a typically large number of its own parameters to maximize its performance while minimizing resource demand, leading to increased operational efficiency.
Self-healing: The ability of the system to recover from failures and to repair itself by analyzing the operational parameters that led to a failure, and taking appropriate corrective action to prevent future disruptions.
Self-protecting: The ability of the system to anticipate, detect and protect itself from cascading failures and malicious external attacks.

The vision of self-managing systems draws inspiration and ideas from the study of systems in Nature that exhibit apparently similar behavior, and from human endeavors such as warfare and gaming. However, a key distinction is that while autonomic behavior in the human body is largely involuntary, the self-management capabilities manifested in autonomic systems are driven by response plans and policies explicitly configured by human administrators to adequately respond to change, anticipated or otherwise. This approach minimizes overall complexity by encapsulating the system’s (self-)management mechanisms and freeing them from frequent low-level human intervention, which in turn frees systems management professionals to focus on areas of higher value to the business.

Another approach [5] to self-managed systems corresponds to the OODA (Observe-Orient-Decide-Act) cycle (Figure 1), originally developed by John Boyd, a USAF military strategist, for military combat operations, and later applied in other areas such as business and learning process optimization. It is infeasible to anticipate all possible change scenarios in a complex system and its environment, such as highly fluctuating loads, unforeseen events, and changes in resource availability, that can cause the system’s performance to deviate from specified goals. This model therefore advocates planning that involves continuous monitoring of the current state and reevaluation of available alternatives, as well as making heuristic compromises to best address the goals, perhaps in an opportunistic manner.

A striking similarity among the approaches is that the self-management ability is the outcome of some form of closed-loop control mechanism at their core. This loop consists of monitoring, sensing and anticipating change; modeling, analyzing and planning; and responding with corrective actions that keep the system within given constraints.

 

Figure 1. OODA Loop (Source: Wikipedia)

Architecture of self-managing systems

While the research community largely leans towards a “top-down” approach (starting with first principles and arriving at a design) that advocates new, specialized kinds of platforms, middleware and language support, the industry has generally preferred a “bottom-up” approach that augments existing platforms and middleware with adaptive capabilities in an evolutionary fashion, to minimize risk and to protect investments in existing platforms and applications. The two approaches may eventually converge with the emergence of patterns and maturation of industry standards to yield an open architecture that caters to heterogeneous vendor environments.

In general, the architecture should support autonomic capabilities that span the system’s functional and non-functional aspects. From a slightly different perspective, self-management may itself be viewed as an essential non-functional attribute that the system architecture should enable through supporting capabilities such as self-monitoring, rigorous analysis, dynamic reconfiguration or re-composition, and hot deployment.

IBM’s Autonomic Computing Initiative [4] incorporates the approaches outlined in the previous section and is currently the most visible and coherent effort towards realizing the vision of self-managed systems, going by the following it has gained among the research and industry communities. It also tends to subsume architectures emerging from several other parallel research efforts [2]. While seemingly ambitious given the current state of technology and standardization efforts, it serves as a blueprint that provides a comprehensive conceptual view that can realistically aid in evolving system-level autonomic capabilities across heterogeneous platforms and technologies with progress in standardization efforts.

To avoid overwhelming complexity, and to achieve other desirable architectural qualities such as better reuse, flexibility, reliability and scalability, it takes a decentralized approach to self-management in which the system is structured as a hierarchical collection of loosely-coupled autonomic components. These components are able to manage their own behavior, and their interactions with other elements, through sensing and interpretation of their local state and the state of their immediate environment. Each autonomic component (hardware or software) can be viewed as being composed of an autonomic manager and a managed resource, following a consistent structure as shown in Figure 2.

 

Figure 2. Autonomic Component Architecture

The managed resource is a piece of software or hardware that contributes to the overall functionality of the system. The autonomic manager implements an intelligent control loop that has four parts, each with its own specialized function and mechanisms, and is supported by a Knowledge store during its execution.

Monitor: Continuously tracks the managed resource for changes through Sensors, or probes, implemented in hardware or software as appropriate for the managed resource. It collects, cleanses and records the data, and provides reports and notifications on it.
Analyze: Infers the significance and impact of changes using static and dynamic models, and triggers the other functions of the autonomic control loop, or other autonomic managers, to respond to the change. Its primary role is problem diagnosis and forecasting based on the models.
Plan: Determines the course of action to be followed to manage changes, based on the specified objectives, constraints, goals and policies.
Execute: Carries out the determined plan through actions such as repair, reconfiguration, or redeployment. The interface required to propagate the actions to the managed resources is provided by Effectors, which may also be implemented in hardware or software, or a combination thereof, as appropriate for the managed resource under consideration.
Knowledge: The data that the above four parts of an autonomic manager require for their operation is stored as shared knowledge, and can include items such as performance metrics, thresholds, topology information, analysis models, and policies.
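
The control loop described above can be illustrated with a minimal sketch. It is not taken from IBM's blueprint or any product; the worker-pool resource, the utilization threshold and all class and method names are assumptions chosen only to show how the Monitor, Analyze, Plan and Execute parts might cooperate around a shared Knowledge store.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ManagedResource:
    """Hypothetical managed resource: a pool of identical workers."""
    workers: int
    load: float                       # incoming requests per second
    capacity_per_worker: float = 10.0

    def sense(self) -> dict:
        """Sensor: report the current operational parameters."""
        return {"workers": self.workers, "load": self.load,
                "capacity_per_worker": self.capacity_per_worker}

    def effect(self, workers: int) -> None:
        """Effector: apply a reconfiguration to the resource."""
        self.workers = workers


@dataclass
class Knowledge:
    """Shared knowledge: a policy threshold plus recorded observations."""
    max_utilization: float = 0.8      # assumed policy value
    history: list = field(default_factory=list)


class AutonomicManager:
    """Illustrative autonomic manager wrapping one managed resource."""

    def __init__(self, resource: ManagedResource, knowledge: Knowledge):
        self.resource = resource
        self.knowledge = knowledge

    def monitor(self) -> dict:
        # Collect a reading through the sensor and record it.
        reading = self.resource.sense()
        self.knowledge.history.append(reading)
        return reading

    def analyze(self, reading: dict) -> bool:
        # Diagnose: is utilization above the policy threshold?
        # (Assumes at least one worker, so capacity is non-zero.)
        capacity = reading["workers"] * reading["capacity_per_worker"]
        return reading["load"] / capacity > self.knowledge.max_utilization

    def plan(self, reading: dict) -> int:
        # Smallest worker count that keeps utilization within the threshold.
        usable = reading["capacity_per_worker"] * self.knowledge.max_utilization
        return math.ceil(reading["load"] / usable)

    def execute(self, workers: int) -> None:
        # Propagate the planned change through the effector.
        self.resource.effect(workers)

    def run_once(self) -> bool:
        """Run one pass of the loop; return True if a change was applied."""
        reading = self.monitor()
        if not self.analyze(reading):
            return False
        self.execute(self.plan(reading))
        return True
```

In this sketch a pool created as ManagedResource(workers=2, load=45.0) is resized on the first call to run_once() and left unchanged on the next call, because the loop closes once the observed state satisfies the policy held in Knowledge.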

 

Figure 3. Layered architecture for self-managed systems

A generalized reference architecture for composing a self-managing system from autonomic components is depicted in Figure 3. As in any component-based architecture, there can be varied relationships among components such as association, dependency, aggregation, composition, generalization, implementation, manifestation, etc. Although each autonomic component broadly conforms to the architecture shown in Figure 2, the degree of sophistication of its control loop and the extent of its influence on the overall system functioning can vary based on the role performed by the component within the overall architecture, its position in the hierarchy vis-à-vis other components, and the nature of its relationships with other components in the system. Components in successively higher layers are designed to address a wider scope than ones at lower levels. As a result, they are accountable for decisions and tradeoffs that have to be made from a broader perspective than individual components, or even from the local view of a given set of components. Extending this argument, components at the highest level are responsible for adjudication at the level of the overall system and its environment.

The lower four layers together encapsulate the self-management capabilities of the system, with successively broader scope of concerns and sophistication at each higher layer. In other words, components at the lower layers tend to exhibit limited, hard-wired autonomic behavior that is often inward-focused, while it tends to be more dynamic, flexible and goal-oriented at the higher levels. As with other layered architectures, it is to be understood that the order of layering described here is more logical than physical, and it is often likely that the actual distribution of the system’s autonomic capabilities across components may not physically correspond to this structure.

The lowest layer represents system components that implement the application logic and other fundamental system qualities such as performance, scalability, reliability and availability, optionally with limited embedded self-management characteristics. The next three layers constitute the core of the system’s autonomic capability, and their individual functions are described below. The top-most layer consists of dashboards that provide dynamic system status, along with notifications and alerts for extraordinary events that are beyond the ability of the lower layers to handle and which call for explicit human intervention. It also exposes administrative interfaces to allow maintenance and manual control of the system’s operation.

Component Monitoring and Control

This layer essentially consists of autonomic managers tied to individual managed resources. They are responsible for gathering and operating on the run-time parameters of the associated resource, by executing their own control loop and/or by firing notifications to other interested autonomic managers, possibly at higher layers. They also apply actions determined by their own control loops, or propagated from higher layers, to the managed resource through effectors.

Component Configuration Management

Typically, though not necessarily, this layer consists of autonomic managers whose interest and influence span multiple components in the lower layer. In fact, orchestration among multiple autonomic managers to achieve specific objectives is a key responsibility of this layer. The autonomic managers in this layer acquire the data necessary for their operation in multiple ways: from lower-level autonomic managers in the form of event notifications (“push”), by actively monitoring them (“pull”), and by receiving mandates from the layer above, which are translated into more concrete directives to the layer below. The results of executing the control loops of the autonomic managers in this layer typically exert indirect influence on the managed resources through the mediation of the autonomic managers that they control, and manifest mainly as configuration changes to the system which aid one or more of its “self-” capabilities. Exceptional events to which components at this layer are unable to respond effectively are delegated to the layer above through a notification mechanism.
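
The three data paths described above (push, pull, and mandates from above), together with upward delegation, can be sketched as follows. The event kinds, the replica-count directive and all class names are hypothetical and serve only to illustrate the interaction pattern, not any particular product or standard.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    source: str
    kind: str
    detail: dict = field(default_factory=dict)


class ComponentManager:
    """Hypothetical lower-layer manager tied to one managed resource."""

    def __init__(self, name: str, replicas: int = 1):
        self.name = name
        self.replicas = replicas
        self.listeners = []

    def subscribe(self, listener) -> None:
        self.listeners.append(listener)

    def notify(self, kind: str, **detail) -> None:
        # "Push": fire an event to every interested higher-level manager.
        event = Event(self.name, kind, detail)
        for listener in self.listeners:
            listener.on_event(event)

    def status(self) -> dict:
        # "Pull": report state when a higher-level manager asks for it.
        return {"name": self.name, "replicas": self.replicas}

    def apply(self, directive: dict) -> None:
        # Apply a concrete directive propagated from the layer above.
        self.replicas = directive["replicas"]


class ConfigurationManager:
    """Illustrative manager orchestrating several component managers."""

    def __init__(self, children, escalate):
        self.children = {child.name: child for child in children}
        self.escalate = escalate      # callback into the layer above
        for child in children:
            child.subscribe(self)

    def on_event(self, event: Event) -> None:
        if event.kind == "overloaded":
            # Handled locally through a configuration change.
            child = self.children[event.source]
            child.apply({"replicas": child.status()["replicas"] + 1})
        else:
            # Beyond this layer's ability: delegate upward.
            self.escalate(event)

    def poll(self) -> list:
        # Actively monitor all children.
        return [child.status() for child in self.children.values()]

    def on_mandate(self, mandate: dict) -> None:
        # Translate a mandate from above into per-component directives.
        for child in self.children.values():
            if child.replicas < mandate["min_replicas"]:
                child.apply({"replicas": mandate["min_replicas"]})
```

Note that the configuration manager never touches a managed resource directly; every change passes through the component manager that owns it, which mirrors the indirect influence described in the text.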

Autonomic Goal and Policy Management

This layer is responsible for translating abstract intents expressed as goals, policies or constraints into actionable mandates. It achieves this by tracking certain key operational parameters (the “SLA parameters”) of the system to identify deviations from specified metrics for the system’s operation defined by system administrators and/or business users, and any violations of constraints that occur. The autonomic managers at this level contain models to diagnose causes of such events, and strategies and mechanisms to contain them. As with the lower layer, notifications are triggered to the higher layer for exceptions that are beyond the ability of this layer to handle.
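
As a rough illustration of this translation step, the sketch below checks a set of SLA parameters against goals, turns each violation into a mandate where a strategy is known, and escalates the rest. The parameter names, limits, mandate fields and the strategy table are all invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Goal:
    """An SLA parameter together with the bound it must satisfy."""
    parameter: str
    limit: float
    higher_is_better: bool = False

    def violated_by(self, value: float) -> bool:
        if self.higher_is_better:
            return value < self.limit
        return value > self.limit


# Hypothetical strategy table: violated parameter -> mandate for lower layers.
STRATEGIES = {
    "response_time_ms": lambda goal, value: {"action": "scale_out"},
    "availability": lambda goal, value: {"action": "add_standby"},
}


def evaluate(goals, metrics, strategies):
    """Return (mandates, escalations) for the observed metrics."""
    mandates, escalations = [], []
    for goal in goals:
        value = metrics.get(goal.parameter)
        if value is None or not goal.violated_by(value):
            continue                  # no data, or goal is being met
        strategy = strategies.get(goal.parameter)
        if strategy is None:
            # No strategy known at this layer: notify the layer above.
            escalations.append((goal.parameter, value))
        else:
            mandates.append(strategy(goal, value))
    return mandates, escalations


goals = [
    Goal("response_time_ms", 200.0),
    Goal("availability", 0.999, higher_is_better=True),
    Goal("error_rate", 0.01),
]
metrics = {"response_time_ms": 350.0, "availability": 0.9995, "error_rate": 0.05}
mandates, escalations = evaluate(goals, metrics, STRATEGIES)
```

Here the response-time violation yields a mandate for the layer below, the availability goal is met and produces nothing, and the error-rate violation has no registered strategy and so ends up in the escalation list destined for the dashboard layer.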

Self-managing Systems – Current State and Emerging Trends

Acutely aware of the rapidly growing importance of, and imminent need for, self-management capabilities in a variety of domains such as enterprise IT systems management, cloud infrastructures, SOA systems, and mobile and pervasive computing, many vendors, industry groups and open source communities are making feverish efforts and substantial investments to develop infrastructure and middleware products, frameworks, tool support, standards, best practices, reference implementations, etc. to accelerate implementation of self-managed systems. The demand from users for such solutions and support has also been on an upswing, considering the potential competitive advantages that they could bring. However, it is important to note that there exists today a yawning gap between the current state of autonomic technology and the projected vision of self-managing systems. While the challenge has engaged the academic research community for quite a while, the industry finds itself just waking up to it with a sudden sense of urgency in an all too familiar “catch-up” mode. Not surprisingly, a lot of dust is being kicked up and it will be some time before clear direction and practical solutions emerge.

The acceptance of autonomic computing as a mainstream technology is likely to follow a trajectory similar to that of other emerging technologies such as cloud computing, SOA, and pervasive computing, with open standards specifications, technology and processes likely to take several years to mature. On the positive side, these emerging technologies too seem to be grappling with a common subset of issues related to heterogeneity and the need for standardization, and progress with any of them will accelerate their advancement as a whole.

Due to the current market volatility, and the limited scope of this article, we are forced to confine ourselves to a peek at a few endeavors that appear to provide glimpses of the autonomic systems of the future. Their choice is based on considerations such as their connection to existing and emerging realities, open standards orientation, and clarity of their roadmap. However, we make no claims about being comprehensive, or about our choices being the most representative. The references section points to several resources that provide more detailed information.

Establishment of, and conformance to, standards is a key success factor, since complexity arising out of heterogeneity is a key focus area of autonomic systems. Several products already exist in the market with varying levels of self-management abilities, albeit implemented in proprietary ways. Such capabilities do help to an extent by providing islands of autonomic ability but, at best, fulfill a minor role with respect to the ocean of self-management adeptness of an overall system. For them to contribute tangibly to system-level self-management behavior, they need to be intentionally designed to fit a standard architecture devised for such a mandate. The standardization of the architecture and the various mechanisms used within it becomes critical considering the bewildering variety and complexity of the elements of end-to-end IT application stacks in existence today.

As mentioned earlier, IBM’s Autonomic Computing Initiative appears to have made considerable progress, with major contributions from multiple university research labs. Much of the work, including a set of technologies, open-source libraries, tools, documentation, examples and scenarios for development of self-managing applications, has been published and is available in the public domain. Microsoft’s Dynamic Systems Initiative (DSI), claimed to be a parallel industry initiative for standards-based autonomic computing, has seen sporadic activity. Although it appears to be largely Windows Server-centric, it is claimed to also support other OS platforms such as Solaris and Linux. However, a clear and consistent roadmap and strategy do not appear to be forthcoming.

Efforts to build adaptive capabilities in Java EE servers seem to have been initiated by the academic community [3] [6]. These were followed by open-source community [7] [8] [9] [10] and industry [11] [12] [13] initiatives. It must be noted, however, that most of these efforts have not attempted standardization beyond the level of the relatively rudimentary JMX and JSR 77. In parallel, a few enabling technologies for self-management seem to have gained a certain degree of traction, with their applicability limited to specific contexts. For example, OSGi [14], although specific to Java, appears to have gained a fair degree of industry acceptance as a technology for enabling dynamic extensibility and re-configurability of systems.

The biggest shortcoming of all these efforts towards realization of distributed, heterogeneous, self-managing systems at present is the lack of broad-based standards-making efforts. Although standards organizations such as DMTF [15] and OASIS [16] have taken the first steps in that direction through standards such as CIM, WSDM and SDD, their scope and applicability remain relatively minuscule considering the vast range of issues that need to be addressed.

It is widely acknowledged that the only pragmatic route to the autonomic computing goal for an enterprise is through a phased implementation as technologies and standards evolve, and a well-orchestrated transition to successively higher levels.

Conclusion

Employing technology to manage technology, as opposed to the current manually intensive methods, will become the inevitable choice. The goal of self-managing systems is not to remove human intelligence completely from the equation, but to channel the need for human intervention to areas that offer a better bang for the buck, as the demand for robust systems gets louder on the one hand while their complexity seems to soar on the other. While the industry still seems grossly unprepared to meet the challenge, the stakes involved and the inspiration provided by the genius of systems in Nature are sufficient motivation to do so.

References

1. Manish Parashar and Salim Hariri, Autonomic Computing - Concepts, Infrastructure, and Applications, CRC Press, 2007

2. Jeff Kramer and Jeff Magee, Self-Managed Systems: An Architectural Challenge, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.104.739&rep=rep1&type=pdf

3. Ian Gorton, Yan Liu, Nihar Trivedi, An Extensible, Lightweight Architecture for Adaptive J2EE Applications http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.84.8213&rep=rep1&type=pdf

4. An architectural blueprint for autonomic computing, www-03.ibm.com/autonomic/pdfs/AC%20Blueprint%20White%20Paper%20V7.pdf

5. Virtualized Execution Realizing Network Infrastructures Enhancing Reliability (VERNIER), http://csl.sri.com/projects/vernier/

6. Yan Liu, Enabling Adaptation of J2EE Applications Using Components, Web Services and Aspects, http://portal.acm.org/citation.cfm?id=1175864

7. JASMINe, http://wiki.jasmine.ow2.org/xwiki/bin/view/Main/WebHome

8. http://sourceforge.net/projects/starmx/

9. http://java-source.net/open-source/jmx

10. http://servicemix.apache.org/jmx-console.html

11. http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/jmxperience.jsp

12. http://weblogs.java.net/blog/2006/02/10/self-management-framework-glassfish

13. http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html

14. http://osgi.org

15. http://dmtf.org/

16. http://oasis-open.org/

Authors

 

Mohana Krishna BG is a Lead Principal, and an educator and mentor in the Architecture Competency stream in the Education and Research group. With 25+ years of experience in the IT industry and academia, he nurtures a vibrant community of architects at Infosys.

 

S Sangeetha is a Principal at the E-Commerce Research Labs, E&R. She has over 12 years of experience in design and development of Java and Java EE applications. She has co-authored a book on ‘J2EE Architecture’ and has also written numerous articles for online Java forums.