Chapter 20. Automatically Adapting Your Environment

<feature><title>In This Chapter</title> <objective>

</objective> <objective>

Automatically Adapting with Operations Manager

</objective> </feature>

In this last part of the book, we move from the more routine discussions of how Microsoft presents Operations Manager 2007 to thinking “outside of the box.” Whereas earlier chapters covered topics such as planning, installing, configuring, administering, and monitoring, this chapter looks at some out-of-the-ordinary approaches to using the product, perhaps stretching your horizon to look at Operations Manager in new and different ways.

As we have discussed throughout this book, Operations Manager 2007 (OpsMgr) is most commonly used to identify conditions that occur, providing notification if these conditions take place. Generically speaking, the concept of monitoring is, “Watch my back (uh, environment) and let me know what’s going on.” Monitoring with OpsMgr allows us to react more quickly than we would otherwise. Once we are aware of an issue, OpsMgr can provide product knowledge to help us more quickly resolve the problem. The user community also assists with problem solving—there are numerous articles and blogs discussing the use of the product.

A recent development in the user community is a new resource called the ReSearch This! Management Pack (RTMP). The RTMP provides a shared community-based knowledge repository for OpsMgr, System Center Essentials, and MOM 2005. You can download the OpsMgr 2007 version of the RTMP at http://systemcenterforum.org/wp-content/uploads/ReSearchThisOpsMgr.zip. A version for MOM 2005 is available as well (http://systemcenterforum.org/wp-content/uploads/ReSearchThisMOM.zip). These links are also available as live links in Appendix E, “Reference URLs.” The RTMP includes tasks that search the SystemCenterForum repository for the alert you specify and link to the SystemCenterForum repository so you can share your resolutions with the community.

Using the RTMP, we can choose an action from the Alert Tasks pane (shown in Figure 20.1), which will search for the alert in the SystemCenterForum shared repository.

Figure 20.1. RTMP task to search for an alert.

Known resolutions for the alert, if available, will display using the format shown in Figure 20.2.

Figure 20.2. Results of the RTMP task to search for an alert.

After an issue is resolved, information on its resolution can be stored in company knowledge and/or shared with the community using the RTMP with the screen shown in Figure 20.3. These two approaches work together to allow an administrator running OpsMgr the means to resolve issues more quickly than would be possible without using a monitoring solution.

Figure 20.3. RTMP task results for sharing an alert resolution.

In addition to identifying conditions that have occurred, OpsMgr provides functionality to help proactively monitor your environment—a better approach to operations management than just resolving problems in reactive mode. Here are two examples:

As a simple example, it is far easier to fix a drive that is low on free space by freeing up additional space than it is to recover from a crash of the system caused by running out of free space.
A more complex illustration is trending system usage to predict when systems may become bottlenecked. To illustrate, if a server is at 50% of memory utilization in January, trends to 60% utilization by March, and 70% utilization in June, it is likely that memory utilization will become the bottleneck prior to September.
Utilizing trending with OpsMgr allows us to be proactive by identifying bottlenecks and resolving them before they affect performance or functionality provided to end users.
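The memory example above can be worked through as a simple linear trend. The following is a minimal sketch, not something OpsMgr provides: it fits a least-squares line to the three readings from the text and estimates when the trend reaches a threshold. The 80% threshold and the function names are assumptions made for illustration.

```python
# Readings from the example in the text: (month number, % memory utilization).
readings = [(1, 50.0), (3, 60.0), (6, 70.0)]  # January, March, June


def fit_line(points):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def month_reaching(points, threshold):
    """Month number at which the fitted trend reaches the threshold."""
    slope, intercept = fit_line(points)
    if slope <= 0:
        return None  # utilization is flat or falling; no crossing ahead
    return (threshold - intercept) / slope


THRESHOLD = 80.0  # assumed bottleneck level, not a value taken from OpsMgr
crossing = month_reaching(readings, THRESHOLD)
print(f"Trend reaches {THRESHOLD:.0f}% around month {crossing:.1f}")
```

With these three readings and the assumed threshold, the fitted line crosses 80% during month 8 (August), which is consistent with the expectation that memory becomes a bottleneck before September.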

OpsMgr enables you to more quickly identify and resolve problems, becoming more proactive in monitoring issues; yet this only scratches the surface of the functionality available with the product. This chapter will tie it all together, discussing what you can accomplish above and beyond the monitoring aspects of OpsMgr. We will examine how you can use OpsMgr to automatically adapt to your environment. We will also discuss how OpsMgr can integrate with other products and provide examples of an environment that can automatically adapt by using Operations Manager 2007. Consider what we are emphasizing throughout as the art of adapting.

Operations Manager Functionality

Operations Manager 2007 provides many different capabilities for automatically adapting your environment. These functions include diagnostics, recoveries, notifications, and console and agent tasks. OpsMgr also provides functionality that is used dynamically. An example of this is through computer groups, which also lend themselves toward the ability to adapt automatically to changes in the IT environment.

We will first look at two commonly overlooked functions available within OpsMgr: diagnostics and recoveries.

Diagnostics and Recoveries

As we discussed in Chapter 14, “Monitoring with Operations Manager,” Operations Manager 2007 uses monitors. Differing from rules, monitors present the state of objects within OpsMgr. When a monitor changes from a healthy state to a nonhealthy state, such as a warning or error state, OpsMgr can activate diagnostic and recovery tasks.

A diagnostic task assists with determining why the change in state occurred. As an example, if there was an issue with Active Directory replication, you could configure a diagnostic task to run that would automatically perform an analysis on replication, such as the repadmin.exe /replsum command. You can configure diagnostic tasks to run automatically or on demand, depending on your requirements:

Make use of automatically run diagnostic tasks when you want to gather information from the system at the time when the change in state occurred (such as the repadmin example in the preceding paragraph).
Utilize on-demand diagnostic tasks when there are common tasks that will occur when the system is in an error state. Different diagnostic tasks can be defined for the different states (such as warning or critical), and multiple diagnostic tasks can be defined on a per-state level. A simple example of an on-demand diagnostic task would be a ping test, which would validate that the system can at least be reached via ping (in the case of a web server, a send or get command test could be used).

A recovery task performs an action that should reverse the state change that occurred. As an example, if there was an issue with the print server spooler service, you could configure a recovery task to restart the service automatically. A more in-depth example of this would gather the job information currently in the queue (to assist in debugging why the queue failed or to know the print jobs affected) and then restart the service. As with diagnostic tasks, you can define multiple recovery tasks for each state level. You can also configure recovery tasks that automatically reset the monitor to a healthy state.
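The more in-depth spooler recovery described above could be scripted along the following lines. This is a sketch under stated assumptions: it shells out to the standard Windows commands `wmic printjob list brief`, `net stop`, and `net start`, and the function names and the `runner` parameter are hypothetical helpers, not part of OpsMgr.

```python
import subprocess


def recovery_commands(service="Spooler"):
    """Commands for the recovery: record the queue contents, then restart."""
    return [
        ["wmic", "printjob", "list", "brief"],  # capture jobs currently queued
        ["net", "stop", service],
        ["net", "start", service],
    ]


def run_recovery(commands, runner=subprocess.run):
    """Run each command in order, collecting output; stop at the first failure."""
    log = []
    for cmd in commands:
        result = runner(cmd, capture_output=True, text=True)
        log.append((" ".join(cmd), result.returncode, result.stdout))
        if result.returncode != 0:
            break  # do not continue the restart sequence after an error
    return log
```

On a Windows print server, calling `run_recovery(recovery_commands())` returns one log entry per command, which the recovery task could write to its output so that the affected print jobs are visible in the Health Explorer.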

Tip: Diagnostics and Recoveries Online

Microsoft provides a series of webcasts to assist with learning the key features within OpsMgr. These videos are available at http://technet.microsoft.com/en-us/opsmgr/bb498237.aspx.

A video that Microsoft created specifically about diagnostics and recoveries is available at http://www.microsoft.com/winme/0701/28666/Diagnostics_and_Recoveries_Demo.asx.

Diagnostic and recovery tasks are available only within monitors, and they are configured within the monitor properties on the Diagnostic and Recovery tab. Figure 20.4 shows an example of this tab on a monitor (this particular case uses the Active Directory Availability Health Rollup monitor). You will find monitor properties either through the Health Explorer or in the Authoring space of the Operations console.

Figure 20.4. The Active Directory Availability Health Rollup monitor’s Diagnostic and Recovery tab.

Diagnostic and recovery tasks are available when there is a state change for the monitor. The tasks are accessed by opening the Health Explorer for the generated alert (or you can navigate to Authoring -> Management Pack Objects -> Monitors). Next, click the monitor, click the State Change Events tab, and then scroll down on the right-side panel to find any available diagnostic and recovery information, including the results of any diagnostic or recovery tasks configured to run automatically. Figure 20.5 illustrates an example of this information in the Health Explorer for a Health Service Heartbeat Failure.

Figure 20.5. Diagnostic and recovery tasks shown within the Health Explorer.

Notice the Details pane (bottom right) shows a diagnostic task called Check If Health Service Is Running and a recovery task called Reinstall Health Service Manually.

Creating diagnostic and recovery tasks is fairly simple. Open the monitor where you want to define the action to occur (shown in Figure 20.4); on the Diagnostic and Recovery tab, click the appropriate Add... button to configure either diagnostic or recovery tasks.

Creating Diagnostic Tasks

Perform the following steps to create a diagnostic task:

Click the Add... button under Configure diagnostic tasks and then choose which state you will create the diagnostic for (shown in Figure 20.6).
Figure 20.6. Determining the criticality level for the diagnostic being created.
After you identify the criticality level for the diagnostic (critical or warning), the only type of diagnostic task available is Run Command. You will also specify the management pack to store the diagnostic in (as with storing overrides, we do not recommend using the Default management pack!).
Next, the Create Diagnostic Task Wizard asks you to define the name, description, and the target of the diagnostic task, and whether the task will start automatically when the state changes to the criticality level you defined.
Figure 20.7 shows an example of creating a diagnostic task to determine top disk space usage.
Figure 20.7. Defining the name, description, and target for the diagnostic task.
On the Command Line page, specify the command-line execution settings. In Figure 20.8, we show a diagnostic task with a script that reports on the top space usage on the drive (diskuse /s does a great job, and it’s free for download as part of the Windows Server 2003 Resource Kit at http://technet.microsoft.com/bb693323.aspx; the full link to the Microsoft download site is available in Appendix E). The script specified here provides information to help with a faster problem resolution for the lack of disk space on the system.
Figure 20.8. Defining the script to run for the diagnostic task.

Creating Recovery Tasks

The process for creating a recovery task is similar to the process of creating a diagnostic task. Create the recovery task on the monitor where you want to define the action to occur (shown in Figure 20.4). Perform the following steps:

On the Diagnostic and Recovery tab, click the Add... button under Configure recovery tasks and then choose the state to create the recovery task for. You can define recovery tasks to run either a command line or a script. In Figure 20.9, we show a recovery task defined to clear additional disk space on a drive. This recovery task is set to run automatically and to reset the monitor after the recovery is complete.
Figure 20.9. Defining the name, description, and target for the recovery task.
This configuration means that when the (critical) condition occurs, the script will run and reset the monitor to a healthy state. The state will update again the next time the monitor is scheduled to run; for the Windows 2003 Logical Free Space monitor, it will run once an hour.
Figure 20.10 shows a recovery task with a script defined to either remove or relocate files on the drive (with an increased value for the timeout from 15 seconds to 120 seconds). This example of a recovery task runs a script that removes or relocates temporary files to help resolve the free disk space issue on the drive that is in a critical state.
Figure 20.10. Defining the script to run for the recovery task.
Diagnostic and recovery tasks provide Operations Manager with the capability to perform automated actions based on the change of the state of objects that OpsMgr is monitoring. They also provide a core capability for creating an environment that can automatically adapt to changes.

Auto-restarting the Health Service

When Operations Manager 2007 was first released, there was a common situation where agents stopped functioning and went to a gray state because the OpsMgr Health service failed. This situation regularly occurred on a large number of systems over time. If this occurs, you may choose to manually recover using a recovery task (you could also open the Computer Management MMC [Microsoft Management Console] and restart the service, but that doesn’t fit our example, so we will disregard that).

If you prefer to have the service restart automatically, you can configure an override. Open the Health Service Heartbeat Failure monitor in the Operations console (select Authoring -> Management Pack Objects -> Monitors and then find the monitor by name). On the Diagnostic and Recovery tab, highlight the Restart Health Service recovery task and select Edit. Create an override that selects the Enabled option and sets the Override Setting to True.

A quick way to determine whether this is functional is to stop the OpsMgr Health service on a system and wait several minutes (by default, three heartbeat intervals of 60 seconds plus the time it takes to activate the recovery task, or about 5 minutes in our test case). Then see whether the Health service restarts as part of the OpsMgr recovery functionality.

To actually stop the Health service on a system, you would either remove this override or stop and then disable the service.
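The restart test described in this section can be timed with a small polling script. The sketch below is illustrative only: it assumes the agent service is registered under the short name `HealthService` and checks it with `sc query`. The function names and the polling interval are hypothetical choices.

```python
import subprocess
import time

HEARTBEAT_INTERVAL = 60  # seconds, the default interval cited in the text
MISSED_HEARTBEATS = 3    # intervals that pass before the monitor changes state


def expected_detection_delay():
    """Seconds before the heartbeat failure is detected (3 x 60 = 180)."""
    return HEARTBEAT_INTERVAL * MISSED_HEARTBEATS


def health_service_running():
    """Windows only: True when 'sc query' reports the service as RUNNING."""
    out = subprocess.run(
        ["sc", "query", "HealthService"],  # assumed service short name
        capture_output=True, text=True,
    ).stdout
    return "RUNNING" in out


def wait_for_restart(is_running, timeout=600, poll=15, sleep=time.sleep):
    """Poll until is_running() is True; return seconds waited, or None."""
    waited = 0
    while waited <= timeout:
        if is_running():
            return waited
        sleep(poll)
        waited += poll
    return None
```

After stopping the service, `wait_for_restart(health_service_running)` should return a value somewhat above the 180-second detection delay if the recovery override is working, and `None` if the service has not come back within the timeout.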

Notification

In Chapter 14, we discussed the improvements in how notification workflow occurs in this version of Operations Manager. Notification can be performed using a variety of methods, with the most common being email. Other available techniques include instant messaging, Short Message Service, and command notification. These different formats are notification channels. The command notification provides an important capability to consider in an adaptable OpsMgr.

The configuration for command notification is available in the Operations console in the Administration space, under Settings -> Notification. Select the Command tab, as shown in Figure 20.11.

Figure 20.11. The Command Notification screen.

You can define multiple notification channels, which perform custom actions. In general, notifications provide information that will require human intervention, using a format such as email or instant messaging.

Command notification opens up a set of more complex potential functions that you might use. A command channel can run a script or an executable file. For example, a notification could be defined that causes a file to be created, which could become part of a workflow. Alternatively, a portal could be updated, with its content reflecting a list of users whose accounts are currently disabled. You are limited only by what you can script!
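As an illustration of the file-creation idea, a command channel could call a script like the one sketched here. Everything in it is an assumption for the sake of the example: the drop directory, the JSON layout, and the idea that the channel passes the alert name and source as arguments are not taken from the product.

```python
import json
import os
import time
import uuid


def write_alert_file(drop_dir, alert_name, source, now=None):
    """Write one JSON file per alert so a later workflow step can pick it up."""
    stamp = time.strftime("%Y%m%dT%H%M%S", time.gmtime(now))
    os.makedirs(drop_dir, exist_ok=True)
    # A short random suffix keeps two alerts in the same second from colliding.
    name = f"alert-{stamp}-{uuid.uuid4().hex[:8]}.json"
    path = os.path.join(drop_dir, name)
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"alert": alert_name, "source": source, "time": stamp}, fh)
    return path
```

A separate process can then watch the drop directory and act on each file, which keeps the notification itself fast and leaves the heavier work to the downstream workflow.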

As an extreme example, Figure 20.12 shows how you could create a notification that causes an audible alert when the notification channel is used. We imagine the robot from Lost in Space saying “Danger, Will Robinson” for the audible alert, but hey, that could just be us.

Figure 20.12. Creating an audible alert command.

Notification not only provides a method to present information for human intervention, but can be used as part of a workflow to integrate more automated interactions. Notifications (particularly command notification) are a key piece of functionality required to adapt automatically to changes in your environment.

Computer Groups

Computer groups in Operations Manager 2007 gather computers into logical units that can be used for a variety of reasons, including scoping the Operations console and providing criteria for notifications.

Operations Manager 2007 computer groups are defined with either static membership or through dynamic membership:

Static membership is when a system is added by name to the computer group.
Dynamic membership is when a computer is determined to be part of a computer group, based on some property of the computer (such as an Exchange server having Exchange services installed).

Computer groups within OpsMgr are very flexible, and you can use them to provide a variety of functions. These functions include the ability to logically group together dispersed systems based on specified criteria. As an example, you might define a computer group based on the Active Directory (AD) site it belongs to. Creating custom groups based on AD sites provides an easy method to scope the OpsMgr console based on the sites that specific administrators are responsible for, or to mass resolve any alerts created by the loss of a link between sites. To create computer groups based on AD sites, perform the following steps:

First, get a list of your AD sites by using the Active Directory Sites and Services tool (we will use Plano as a sample site name for this example). Next, open the Operations console under Authoring -> Groups. Right-click and create a new group. Enter a friendly name (for example, Odyssey Plano Computers) and a description.
Choose the management pack to store this in (preferably create a separate management pack versus using the Default management pack). Click Next to continue.
Click Next to continue on the Explicit Group Members page.
Click Create/Edit rules on the Dynamic Inclusion Rules page.
Select the class Windows Computer and click the Add button.
Select Active Directory Site, Equals, and the AD Site Name (for example, Active Directory Site, Equals, Plano). Click OK.
Note: Custom Computer Group Options for the Windows Computer
The Windows Computer class has properties that include Principal Name, DNS Name, NetBIOS Computer Name, NetBIOS Domain Name, IP Address, Network Name, Active Directory SID, Virtual Machine, DNS Domain Name, Organizational Unit, DNS Forest Name, Active Directory Site, and Display Name. Therefore, a lot of cool things can be done here, such as easily identifying systems in Organizational Units (OUs), forests, and even virtual machines!
Click Next to continue on the Dynamic Inclusion Rules page.
Click Next to continue on the Add Subgroups page.
Click Create to create the computer group on the Exclude Objects from this group page.
In the Monitoring space, you can now use the Scope button to choose the new group. In addition, if you prefix the computer group names with the name of your company (in our example, we would start the computer group names with Odyssey), you can easily choose from the list of AD sites.
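The difference between static and dynamic membership in the steps above can be modeled in a few lines. This is a conceptual sketch only and does not reflect how OpsMgr stores groups. The `Computer` and `Group` classes are hypothetical, and a dynamic rule is simplified to a list of property/value pairs that must all match.

```python
from dataclasses import dataclass, field


@dataclass
class Computer:
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class Group:
    name: str
    explicit: set = field(default_factory=set)  # static members, by name
    rules: list = field(default_factory=list)   # (property, expected value)

    def contains(self, computer):
        """True for explicit members or computers matching every rule."""
        if computer.name in self.explicit:
            return True
        return bool(self.rules) and all(
            computer.properties.get(prop) == value for prop, value in self.rules
        )


plano = Group(
    "Odyssey Plano Computers",
    rules=[("Active Directory Site", "Plano")],
)
```

A computer whose Active Directory Site property later changes to Plano becomes a member without anyone editing the group, which is the behavior that makes dynamic groups useful in an environment that changes.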

Computer groups are extremely flexible within OpsMgr 2007 and are important in an adaptable environment. The fact that OpsMgr groups can automatically identify their appropriate management packs brings OpsMgr another step closer to meeting the requirements of a changing information technology environment.

Tip: Populate Dynamic Computer Groups with Health Service Watchers

The Watchanator is a management pack developed by Timothy McFadden to populate dynamic computer groups with the associated Health Service (Agent) watchers. This is useful because you would then receive heartbeat (up/down) alerts for your systems!

You can download the Watchanator from http://timothymcfadden.googlepages.com/Watchanator2.zip. Timothy’s write-up is at http://scom2k7.blogspot.com/2007/10/watchanator-20-heart-beat-alert-tool.html.

Console Tasks

OpsMgr uses console tasks to automate the process of performing commonly executed functions from the Operations console. Tasks make it easier to complete these functions, and they also provide us with a basis to extend the capabilities available with the diagnostic and recovery tasks we discussed in the “Diagnostics and Recoveries” section earlier in this chapter.

Console tasks exist within management packs and are context specific—meaning that they are only available when the object you are working with allows that particular console task to occur. As an example, if you highlight an Active Directory–related alert, you would not want to see SharePoint-related console tasks. A number of console tasks are delivered within the management packs from Microsoft, and you can use them to automate various functions. You can create tasks to run a command line or a script, running them from the console (alternatively, you can run agent tasks on a managed system).

Enhancing Diagnostics and Recoveries

At first glance, console tasks would appear to be irrelevant to automatically adapting an environment, because they require activation by someone using the Operations console. Yet while the console tasks themselves require interaction, they can provide functionality to incorporate into diagnostics and recoveries.

There is no easy way to include an existing console task in a diagnostic or recovery task (because you are not able to choose console tasks or scripts from the drop-down box when creating a recovery task). However, the commands and scripts used by console tasks can easily be re-created using Windows Notepad and then copied and pasted between the console task and the diagnostic or recovery task.

Figure 20.13 shows a script used by a console task (the List Top Processes on DC task in this case). You can transfer the script from the console task to a diagnostic or recovery task by copying the script from the console task, pasting it into Notepad temporarily, and then copying and pasting it from Notepad into the diagnostic or recovery task.

Figure 20.13. Top CPU script within the List Top Processes task.

Transferring command-line tasks is a little more difficult because the configuration for the task displays in XML. However, the important information is easy enough to interpret. Figure 20.14 shows the configuration from the Repadmin /replsum task.

Figure 20.14. Configuration for the Repadmin /replsum task.

In Figure 20.15, we show creating a diagnostic task with the following information from the Repadmin command-line task added to it:

ApplicationName provides the name of the command line to run.
SupportToolsInstallDir is the path to the executable.
CommandLine provides the information for the parameters.
TimeoutSeconds provides the timeout value.

Figure 20.15. Creating a diagnostic from the repadmin task.

The information in the box below the timeout is the information copied from the script in Figure 20.14.
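If you need to copy settings from several command-line tasks, the four values listed above can be pulled out of the XML programmatically. The sketch below uses a hypothetical configuration fragment: only the element names come from the text, while the root element, the values, and the flat layout are assumptions, and the real management pack XML may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; only the element names are taken from the text.
SAMPLE_CONFIG = r"""
<Configuration>
  <ApplicationName>repadmin.exe</ApplicationName>
  <SupportToolsInstallDir>C:\Program Files\Support Tools</SupportToolsInstallDir>
  <CommandLine>/replsum</CommandLine>
  <TimeoutSeconds>300</TimeoutSeconds>
</Configuration>
"""

FIELDS = (
    "ApplicationName",
    "SupportToolsInstallDir",
    "CommandLine",
    "TimeoutSeconds",
)


def task_settings(xml_text):
    """Return the four settings as a dict, with the timeout as an integer."""
    root = ET.fromstring(xml_text.strip())
    settings = {name: (root.findtext(name) or "").strip() for name in FIELDS}
    settings["TimeoutSeconds"] = int(settings["TimeoutSeconds"] or 0)
    return settings
```

The resulting dictionary maps directly onto the fields you fill in on the Command Line page when creating the diagnostic or recovery task.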

Console tasks provide a wealth of scripts and command-line options that increase the capabilities of diagnostics and recoveries. There are not many predefined diagnostics and recovery tasks, but there are quite a few predefined console tasks to use for providing additional functionality with diagnostics and recoveries.

Identifying Conditions

OpsMgr has the capability to identify conditions based on a variety of different sources of information, including the Windows Event log, generic log files, Windows Services, Windows Management Instrumentation (WMI), performance counters, and Simple Network Management Protocol (SNMP) events (see Chapter 14 for additional information). After identifying a condition, OpsMgr can perform an action using the notification and/or diagnostic and recovery functions, which provide a method to react to the condition. As we discussed, notifications, diagnostics, and recoveries all provide the ability to run a script (among other actions).

All of this reduces to a core concept: Can you identify it, and can you script it? If so, you can do it using OpsMgr. This concept is the foundation we will build on for the remainder of this chapter.

Tip: On the CD

Microsoft’s TechNet magazine includes a column where the “Scripting Guys” address commonly asked questions about system administration scripting. These questions are also available on the Microsoft website, at http://www.microsoft.com/technet/scriptcenter/resources/qanda/default.mspx. The Scripting Guys have created a downloadable archive of these scripts from August 2004 to June 2006 in CHM format. For your convenience, we include this archive on the CD accompanying this book.

Operations Manager Integration

Although Operations Manager is an important piece of the puzzle in automatically adapting to a changing environment, you can significantly increase its capabilities by interacting with products both within the System Center family and with other Microsoft products. In the next sections, we will discuss a variety of products that can work together with OpsMgr to provide a foundation for automatically adapting to changes.

Configuration Manager

System Center Configuration Manager (formerly named Microsoft Systems Management Server, or SMS) provides hardware and software inventory, software distribution, patch management, remote control, software metering, Operating System Deployment (OSD), and device management for Windows-based environments. There is built-in integration with Operations Manager. This integration includes

Putting a monitored system into maintenance mode prior to rebooting it with ConfigMgr through software deployment
Generating alerts on application (or OSD) deployments
Deploying agents using ConfigMgr

This section focuses on ways to increase the functionality or automation of OpsMgr through integration with ConfigMgr.

Using Collections

ConfigMgr uses collections to target software or operating systems for deployment. You define collections using either static or dynamic membership (similar to defining computer groups in OpsMgr 2007). As we discussed in Chapter 9, “Installing and Configuring Agents,” you can use ConfigMgr to deploy the Operations Manager 2007 agent. However, this is only the beginning of how these two products can complement each other.

Using ConfigMgr, you can create collections that will identify systems requiring service packs prior to deploying the Operations Manager agent (the same collections can also be the target of the service pack and deployed via ConfigMgr/SMS). As an example, the following query identifies systems running Windows Server 2003 that do not have Service Pack 1 (SP 1) applied. We are interested in this query because SP 1 is a prerequisite to installing the OpsMgr agent on Windows Server 2003 systems.

select SMS_R_System.Name, o.CSDVersion, o.Version from SMS_R_System inner join
SMS_G_System_OPERATING_SYSTEM as o on o.ResourceID = SMS_R_System.ResourceId
where o.Version = "5.2.3790" and o.CSDVersion not in ( "Service Pack 1",
"Service Pack 2" )

At first glance, this query looks pretty complex, but the query was created using the ConfigMgr graphical user interface (GUI), and it’s just shown here to provide a quick way to reuse any queries that are created. The following query identifies systems running Windows 2000 Server that do not have SP 4 applied, a requirement for installing the OpsMgr agent on Windows 2000 systems:

select SMS_R_System.Name, SMS_R_System.NetbiosName, o.CSDVersion, o.Version from
SMS_R_System inner join SMS_G_System_OPERATING_SYSTEM as o on o.ResourceID =
SMS_R_System.ResourceId where o.Version = "5.0.2195" and o.CSDVersion not in
( "Service Pack 4" )
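The two queries above differ only in the operating system version and the list of acceptable service packs, so they can be generated from one template. The helper below is a hypothetical convenience for building the query text and is not a ConfigMgr API. It reuses the class and property names exactly as they appear in the queries above.

```python
def missing_service_pack_query(os_version, accepted_service_packs):
    """Build a WQL query for systems lacking any of the accepted service packs."""
    accepted = ", ".join(f'"{sp}"' for sp in accepted_service_packs)
    return (
        "select SMS_R_System.Name, o.CSDVersion, o.Version "
        "from SMS_R_System inner join SMS_G_System_OPERATING_SYSTEM as o "
        "on o.ResourceID = SMS_R_System.ResourceId "
        f'where o.Version = "{os_version}" '
        f"and o.CSDVersion not in ( {accepted} )"
    )


# Windows Server 2003 without SP 1 or SP 2, as in the first query.
w2k3_query = missing_service_pack_query(
    "5.2.3790", ["Service Pack 1", "Service Pack 2"]
)
# Windows 2000 without SP 4, as in the second query.
w2k_query = missing_service_pack_query("5.0.2195", ["Service Pack 4"])
```

The generated text can be pasted into the query statement of a collection's membership rule in the same way as the hand-built queries.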

Another example of where the two products complement each other is creating a collection to identify the systems with the OpsMgr agent deployed. You could then target this collection to deploy additional support tools used by the management packs. Here’s an example of a query defining a collection that identifies systems with the OpsMgr agent:

select Name from SMS_R_System where ResourceId in (select SMS_R_System.ResourceId
from SMS_R_System inner join SMS_G_System_ADD_REMOVE_PROGRAMS on
SMS_G_System_ADD_REMOVE_PROGRAMS.ResourceID = SMS_R_System.ResourceId where
SMS_G_System_ADD_REMOVE_PROGRAMS.DisplayName in
( "System Center Operations Manager 2007 Agent" )) order by Name

Tip: On the CD

The queries shown here are on the CD accompanying the book, and you can copy/paste them into your ConfigMgr environment.

This query states that to be a member of the System Center Operations Manager 2007 Agent collection, the agent application must be listed in the Control Panel Add/Remove programs applet. After defining this collection, you can configure a package to push out to that collection that will automate installing additional support tools for the system.

ConfigMgr, coupled with OpsMgr’s Active Directory Integration capability (see Chapter 9), lets you automatically deploy OpsMgr agents and then automate additional supplemental software deployments.

Using ConfigMgr to Install the Operations Console

You can use ConfigMgr to automate deploying the Operations console, perhaps to a collection of OpsMgr administrator workstations. The installation requires creating four packages (the package installing the Operations console requires that the other three are installed first).

The first package installs .NET Framework 2.0 with this command:
```
Microsoft.NET Framework 2.0.exe /Q
```
The second installs .NET Framework 3.0:
```
dotnetfx3.exe /q
```
The third installs PowerShell:
```
powershell2003sp1.exe /quiet
```
The fourth package installs the Operations console only (substitute RMSServer with your environment’s RMS server).
```
%WinDir%\System32\msiexec.exe /i <path>\MOM.msi /qn /l*v
%Temp%\MOMUI_install.log ADDLOCAL=MOMUI
ROOT_MANAGEMENT_SERVER_DNS=<RMSServer>
```
Thanks to Tarek Ismail (http://tarek-online.blogspot.com/) for writing this one up!
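When the same console installation is packaged for several environments, it can help to generate the fourth command line instead of editing it by hand. The following sketch assembles the arguments from the command shown above. The function name and the example paths and server name are hypothetical, while the switches and the ADDLOCAL and ROOT_MANAGEMENT_SERVER_DNS properties are the ones given in the text.

```python
# Prerequisite packages in the order the text says they must be installed.
PREREQUISITES = [
    "Microsoft.NET Framework 2.0.exe /Q",
    "dotnetfx3.exe /q",
    "powershell2003sp1.exe /quiet",
]


def console_install_command(msi_path, rms_server, log_path):
    """Argument list for a silent, console-only installation."""
    return [
        "msiexec.exe",
        "/i", msi_path,
        "/qn",                # no user interface
        "/l*v", log_path,     # verbose log
        "ADDLOCAL=MOMUI",     # install the Operations console only
        f"ROOT_MANAGEMENT_SERVER_DNS={rms_server}",
    ]


# Example values are placeholders, not real paths or servers.
command = console_install_command(
    r"\\server\share\MOM.msi", "rms.example.com", r"C:\Temp\MOMUI_install.log"
)
print(" ".join(command))
```

Building the command as a list keeps each argument separate, which avoids quoting mistakes if a path contains spaces.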

Integrating with Agentless Exception Monitoring

Configuration Manager also integrates with the Agentless Exception Monitoring (AEM) capability of Operations Manager. AEM identifies applications that have crashes. You can use Configuration Manager to patch the application that is crashing, and then use AEM again to determine whether the patch had an impact on the number of crashes occurring within that application.

Extending Functionality with Scripting

Some of the functionality available within ConfigMgr is executable from the command line, and various scripts are available from locations such as www.myitforum.com and www.faqshop.com. We include scripts that provide the ability to perform a software or hardware inventory, or to add a computer to a static collection. These scripts are on the CD accompanying this book.

The synergy between the OpsMgr and ConfigMgr products provides new capabilities for Operations Manager 2007 that are of use for automatically adapting with OpsMgr.

Capacity Planner

System Center Capacity Planner (SCCP) provides the ability to develop architectures for Exchange 2003/2007 and eventually OpsMgr 2007 to test various “what if” scenarios and to identify potential bottlenecks. Although the new version of SCCP 2007 does not currently support OpsMgr 2007, we expect it to provide similar functionality for OpsMgr 2007 in 2008. SCCP is currently a standalone product, but it has the potential for integration with OpsMgr 2007.

SCCP asks a variety of questions that assist with describing a model for a recommended architecture. When Exchange is being modeled, for example, the information gathered includes the number of sites and Outlook client usage, network connectivity between sites, and potential hardware configurations.

Looking at the information SCCP uses for its capacity model, you may notice that the majority of this data could be drawn from existing sources. Drawing from these sources would automate the integration of the Capacity Planner. Examples include the following:

Site information from Active Directory sites and services.
Hardware configurations from ConfigMgr hardware inventory.
Outlook and client usage from ConfigMgr software metering.
Network connectivity information may be available from OpsMgr; the product includes integration with network devices, and that functionality is enhanced in the next version.

Why does it matter that we already have this information in the System Center product line? Using information already available gives us the ability to proactively plan changes (such as adding sites to the network), changes in the usage of Exchange, and changes in user counts. Let’s work with this idea, using Exchange as an example.

Your boss walks into your office and tells you that your company is thinking about acquiring another company (hey, it would be nice if they actually told you that in advance for a change). He wants you to migrate the new user base into your current Exchange environment (the other company has 1000 users with 400GB of mail that will need to be transferred if the company is acquired). With an integrated OpsMgr/ConfigMgr/SCCP, you could build a projection that would indicate potential bottlenecks from adding the users into the current environment, as well as display any upgrades required to make it possible.

We can take another more common Exchange scenario. Your mailbox store is increasing at a rate of 30GB per month. You use OpsMgr to run a scheduled SCCP plan based on current growth rates, getting bottleneck information in advance. This allows you to make changes and resolve issues before they occur.
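The arithmetic behind such a growth projection is simple. The following is a minimal sketch, assuming a constant monthly growth rate; the function name, its parameters, and the 400GB/1000GB figures are hypothetical and are not part of SCCP or OpsMgr.

```powershell
# Hypothetical helper: whole months before a mailbox store reaches capacity,
# assuming growth stays constant at $GrowthGBPerMonth.
function Get-MonthsUntilFull {
    param(
        [double]$CurrentGB,
        [double]$CapacityGB,
        [double]$GrowthGBPerMonth
    )
    if ($GrowthGBPerMonth -le 0) { return $null }   # no growth: the store never fills
    $remaining = $CapacityGB - $CurrentGB
    if ($remaining -le 0) { return 0 }              # already at or over capacity
    return [int][math]::Floor($remaining / $GrowthGBPerMonth)
}

# Illustrative values: a 400GB store on a 1000GB volume, growing 30GB per month
Get-MonthsUntilFull -CurrentGB 400 -CapacityGB 1000 -GrowthGBPerMonth 30
```

With these illustrative numbers the result is 20 months. A real capacity model would consider far more than disk space, which is where a tool such as SCCP comes in.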

The further integration of the System Center product line, including SCCP with OpsMgr, holds some exciting potential capabilities that will work toward providing a more proactive monitoring and management solution.

Virtual Machine Manager and Virtual Server 2005 R2

The System Center Virtual Machine Manager (SCVMM) provides a centralized product for deploying and managing virtual machines in a virtual server environment. It includes the ability to determine servers that are good candidates for virtualization, and it provides an easy-to-use Physical to Virtual (PtoV) conversion process. SCVMM also provides a self-service provisioning capability and the ability to perform automated processes with PowerShell. Figure 20.16 shows the SCVMM interface, running on a host system called Hurricane with a guest operating system named Honor.

Figure 20.16. The System Center Virtual Machine Manager interface.

The user interface on SCVMM is very similar to the OpsMgr 2007 Operations console. The left side shows the different host systems available, with different views for virtual machine statuses such as running, paused, and saved. The bottom-left panel provides a list of options, including the Hosts, Virtual Machines (shown), Library, Jobs, and Administration. The Actions pane shows a variety of items that perform tasks such as creating a new virtual machine, adding a library server, adding hosts, and changing the state of a guest operating system (Start, Stop, Pause, Save State, Discard saved state, Shut down).

Tip: SCVMM Scripting Guide

Microsoft has released a scripting guide for SCVMM, which contains nearly 150 pages of information (including sample scripts) on how to script for System Center Virtual Machine Manager 2007. The scripting guide is available for download at http://go.microsoft.com/fwlink/?LinkId=104290.

You can also clone virtual machines and store them in a library for reuse. SCVMM provides some capabilities that greatly enhance the functionality currently available in Virtual Server R2. There are a lot of very exciting possibilities for automatically adapting your environment, which we will discuss later in this chapter in the “Automatically Adapting with Operations Manager” section.

PowerShell/Command Shell

In our opinion, PowerShell is one of the more exciting concepts to come out of Microsoft in a long time. One of the true strengths of UNIX-based systems is the ability to script just about anything, automating any process. Whereas UNIX started as a command-line user interface, Windows began as a graphical user interface. Although a GUI is much more intuitive and often easier to use, it can lack the power of a command-line environment, where you can script and automate processes. Microsoft designed PowerShell (previously called Monad) to provide the best of both worlds, bringing an extremely powerful command-line interface to the Windows platform. The Operations Manager Command Shell is the OpsMgr-specific version of the PowerShell interface.

PowerShell is integrated into several Microsoft products, including Exchange 2007, OpsMgr 2007, System Center Data Protection Manager V2, SCVMM, and all versions of Windows Server 2008 except for the Server Core. There are even extensions to add PowerShell functionality into non-PowerShell-integrated products such as SMS 2003 (for details see http://www.microsoft.com/technet/technetmag/issues/2007/11/UtilitySpotlight/, which discusses adding the management of SMS clients from the command line).

Tip: PowerShell cmdlets and Additional Functionality

PowerShell is really catching on; even third-party organizations are providing free downloadable cmdlets. As an example, Quest Software provides a free set of PowerShell commands for Active Directory, available at http://www.quest.com/activeroles-server/arms.aspx. PowerGUI, an extensible console based on PowerShell, is also available for download at http://www.powergui.org/downloads.jspa.

The PowerShell syntax is straightforward and works using the following format:

Verb-Noun Parameter

As an example, to get the status of a service, we would use get-service and then the name of the service. The syntax to get the status for the OpsMgr HealthService would be

get-service HealthService

The results of this command present the status of the service, shown next for a running service:

Status    Name            DisplayName
Running   HealthService   OpsMgr Health Service

You can run get-service HealthService | fl for a more detailed description of the HealthService. Using this syntax gives us the following output:

Name                : HealthService
DisplayName         : OpsMgr Health Service
Status              : Running
DependentServices   : {}
ServicesDependedOn  : {rpcss}
CanPauseAndContinue : True
CanShutdown         : True
CanStop             : True
ServiceType         : Win32ShareProcess

PowerShell for Beginners

When we were starting out with PowerShell, we wanted a quick way to find out what custom-built collections existed within an SMS site and to save that information to a file for review. We started by installing PowerShell and checking into what functionality was available. (PowerShell requires installation of .NET 2.0.)

Not knowing what command (known as a cmdlet in PowerShell) we needed, we opened PowerShell (available on the Start menu, in the Windows PowerShell 1.0 folder -> Windows PowerShell). We started, logically enough, with the following command:

help

Help provided us with a list of all cmdlets available. There were quite a few, so we did the following:

help *wmi*

This gave us a list limited to the cmdlets with wmi in them. This led us to find out that get-wmiobject likely was what we were looking for. To get additional information on this cmdlet, we typed the following:

help get-wmiobject -detailed
help get-wmiobject -full

The help information provided us with what we needed, because we already had the query to use with WMI. Here’s the resultant command:

get-wmiobject -query "select * from sms_collection where collectionid like 'DAL%'" `
    -namespace root\sms\site_DAL -computername monarch > collections.txt

This resulted in creating a collections.txt file with the detail for custom collections available within our SMS site (which is DAL, and for the computer we specified “monarch”).

Additional documentation and versions of PowerShell for Windows XP and Windows 2003 Server (and 2003 x64) are available for download if you go to http://download.microsoft.com and search on “PowerShell.” Microsoft also has developed webcasts on PowerShell, available at http://www.microsoft.com/technet/scriptcenter/webcasts/ps.mspx. In addition, we provide a PowerShell Cheat Sheet created by Microsoft, available on the CD accompanying this book.

There is also a PowerShell cmdlet named Start-Transcript. You can use this to record your PowerShell session in a text file. This is useful to help remember what you did!

From an OpsMgr functionality perspective, we anticipate PowerShell will be indispensable, because it will provide command-line control of tasks previously performed with the GUI. This also provides direct benefits to OpsMgr, because new functions should be available to use as tasks or diagnostics and recoveries through integration with PowerShell.

Service Manager

Although the System Center Service Manager (ServiceMgr) product is currently in beta, the functionality currently planned integrates with Active Directory, Operations Manager, and SMS/ConfigMgr. ServiceMgr is a new product in the System Center product line, designed to provide incident management, change management, knowledge sharing, and self-service provisioning capabilities. ServiceMgr builds on multiple technologies, including Windows 2003 SP 1, the .NET 2.0 and 3.0 Frameworks, SQL 2005 SP 2 with Reporting Services, and SharePoint 2007.

ServiceMgr uses connectors that integrate information from other sources:

The Active Directory connector gathers user information into ServiceMgr, so you do not need to configure the same information in both locations.
The connector for SMS/ConfigMgr integrates the hardware and software asset information available within that product into ServiceMgr.
The connector for OpsMgr provides the ability to create incidents from alerts or to view Operations Manager knowledge.

ServiceMgr also includes a self-service portal function, which you can use to file incidents and request software for deployment. The software request functionality includes the ability to provide a workflow and approval for the requests. After a request is approved, SMS/ConfigMgr deploys the software to the resource specified by the user.

We believe the direction for Service Manager is one that will continue to integrate the System Center product line and leverage benefits from the existing products. We will discuss implications of the self-service portal function within the “Automatically Adapting with Operations Manager” section of this chapter.

In addition to the System Center family, other products outside of the System Center product line can also be integrated with OpsMgr, which we will look at in the next sections of this chapter. These products provide pieces of functionality needed to create an environment that automatically adapts.

SharePoint

In Microsoft Operations Manager (MOM) 2005, Microsoft provided a SharePoint 2003 WebPart that integrated the state of computers monitored by MOM 2005 into a SharePoint site. This functionality provided a dashboard-level view of the state of the MOM environment for non-MOM administrators.

Note: MOM 2005 SP 1 and the SharePoint WebPart

With the release of MOM 2005 SP 1, the MOM 2005 SharePoint WebPart stopped working correctly. The first issue we ran into was getting it installed. The installation failed unless we included the correct SharePoint Virtual Server information, which we show next. The virtual server’s physical location defaulted correctly, but the SharePoint virtual server defaulted incorrectly.

SharePoint Virtual Server (including port)
http://monarch:81
Virtual Server Physical Location:
C:\inetpub\wwwroot

Next, we had to change the config.web file for SharePoint (for us located under C:\Inetpub\SharePoint) from

   <trust level="Full" originUrl="" />

to

   <trust level="WSS_Minimal" originUrl="" />

And then we had to perform an IISreset.

Now we could add the WebPart to the SharePoint (or WSS) site, but it would error out when we attempted to configure it by clicking the Show MOM 2005 Property ToolPane button. With significant help from Microsoft, we made the change detailed below.

Edit the config.web file for SharePoint (for us located under C:\Inetpub\SharePoint) to add the following information just after the </configSections> tag:

<runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
     <dependentAssembly>
       <assemblyIdentity name="microsoft.web.ui.webcontrols"
                         publicKeyToken="31BF3856AD364E35"
                         culture="neutral" />
       <!-- Redirecting to version 5.0.2749.5 of the assembly. -->
       <bindingRedirect oldVersion="5.0.2749.0"
                        newVersion="5.0.2911.0"/>
     </dependentAssembly>
    </assemblyBinding>
</runtime>

Next, we did an IISreset. Finally, we had to copy the microsoft.web.ui.webcontrols.dll file to the SharePoint bin directory (for us it was C:\inetpub\SharePoint\bin). After this, we were able to configure the WebPart.

Although there is not currently a WebPart available for OpsMgr 2007, the existence of a WebPart for the MOM 2005 version implies that the same functionality could exist with OpsMgr 2007.

The same concepts that applied to the previous versions of Microsoft Operations Manager (MOM) and SharePoint should apply to the current versions. WebParts can be developed to integrate the two products, providing server status information for the servers monitored with OpsMgr. With OpsMgr, the potential capabilities of integration between these two products may be even more exciting. Instead of only reporting on the state of the computers monitored by OpsMgr, you can also report the state of a distributed application. OpsMgr provides the capabilities to schedule reports and to publish them. You can then link to these reports through SharePoint.

SharePoint could also provide a storage area for items that are not scheduled reports but provide good general information to have available on a website. One could publish information gathered from various sources such as console tasks, diagnostics, and recoveries. You could send that information to the SharePoint site, providing a centralized repository for non-OpsMgr administrators to check on the state of recent technical tasks performed. As an example, if your organization created a script to report periodically on the status of key pieces of the infrastructure (dcdiag, netdiag, replmon, netdom query fsmo, and so on), you could publish those results to SharePoint.

SharePoint provides the potential for both a dashboard view of the environment and a web-based repository of information, which is not available through OpsMgr’s Reporting services.

Exchange 2007

Microsoft Exchange 2007’s integration with PowerShell is a significant change when compared with Exchange 2003. This integration significantly increases the potential of scripting actions that previously were not possible. As we increase what we can script, we also increase our options for automating processes.

Exchange 2003 incorporated two roles: frontend server and backend server. In Exchange 2007, there are now five roles: Edge Transport, Hub Transport, Mailbox, Client Access, and Unified Messaging.

The Edge Transport is installed on a standalone server on the edge of the network and provides functions such as antivirus and antispam protection. This is the only role that can’t be installed with other roles.
The Hub Transport provides message-routing functionality within the Exchange organization.
The Mailbox server holds mailbox data.
The Client Access server is the connection point for Outlook clients (among others), ActiveSync, and Outlook Web Access.
The Unified Messaging server merges VoIP infrastructures with the Exchange organization.

Mass-creating User Accounts in Exchange 2007/Active Directory

One of our clients had a requirement to create a large number of email-enabled users. Although they did not have current accounts to use for a migration, they provided a phone list we could use to generate the user information. Because the client required Exchange 2007 accounts, we decided to use PowerShell to create the accounts. There is plenty of good documentation on how to use the new-mailbox cmdlet, but we are including this information here to show how Excel can be used to format data that you can save to a .csv format and then integrate into a cmdlet.

To create the user accounts, we took the phone list and formatted it into the structure shown in Table 20.1 with Excel (to simplify the import process, we named each of the fields with the same naming conventions used to import mailboxes with the new-mailbox cmdlet).

Next, we used the following PowerShell script to add the users:

import-csv create_users.csv | foreach {
    new-mailbox -alias $_.alias -name $_.name `
        -userPrincipalName $_.UserPrincipalName -database $_.database `
        -organizationalunit $_.organizationalunit `
        -firstname $_.firstname -lastname $_.lastname
}

After the users were added and replication occurred, the user accounts appeared within the Exchange Management console. There was only one downside with this approach; we had to type the password for each account.

Table 20.1. User Account Fields

Alias	Name	UserPrincipalName	OrganizationalUnit	Database	FirstName	LastName
John.Smith	Smith, John	[email protected]	Import	First Storage Group\Mailbox Database	John	Smith
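Because one incomplete row can stop a bulk import partway through, it may be worth checking the .csv file before passing it to new-mailbox. The following is a minimal sketch of such a pre-flight check; the column names come from Table 20.1, but the Get-MissingField helper is hypothetical and was not part of the procedure we used.

```powershell
# Column names taken from Table 20.1
$requiredFields = 'Alias', 'Name', 'UserPrincipalName', 'OrganizationalUnit',
                  'Database', 'FirstName', 'LastName'

# Hypothetical helper: returns the names of the required fields that are
# missing or empty on one imported row.
function Get-MissingField {
    param($Row, [string[]]$Fields)
    foreach ($field in $Fields) {
        $value = $Row.$field
        if ($value -eq $null -or "$value".Trim() -eq '') { $field }
    }
}

# Report problem rows (nothing is created here); skipped if the file is absent
if (Test-Path create_users.csv) {
    import-csv create_users.csv | foreach {
        $missing = @(Get-MissingField -Row $_ -Fields $requiredFields)
        if ($missing.Count -gt 0) {
            "Row '$($_.Alias)' is missing: " + [string]::Join(', ', $missing)
        }
    }
}
```

Rows that produce no output have a value in every column; the check does not verify that the values themselves are acceptable to Exchange.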

Microsoft released the Exchange 2007 management pack for OpsMgr 2007 in October 2007. Based on the Exchange 2003 management pack, we anticipated the Exchange 2007 management pack would provide the following information:

Up/down status of the Exchange services.
Dynamic thresholds on common performance counters.
Mail-flow testing between Exchange servers.
Reporting on common items (such as top mailbox sizes, SMTP usage).
Identifying best-practice antivirus configurations. (Exclude Exchange programs, databases, and log files from antivirus scanning because scanning is a common cause of data corruption.)
Synthetic transactions to test Outlook Web Access and ActiveSync.

The area that is exciting to consider here is how well the new PowerShell functionality will integrate with the Exchange 2007 management pack. Based on the current Exchange 2003 management pack, common tasks would include the stopping/restarting of Exchange-related services, querying of queue state, and executing the best practices analyzer functions.

There are many interesting possibilities for tasks or features of the management pack, such as the following:

Performing configuration changes identified by the Exchange Best Practices Analyzer (a Do-IT button).
Mailbox migration between Exchange servers.
Execution of a script to fix alias names that include spaces. (Exchange 2007 does not allow spaces or special characters in the alias names field.)
Automating load balancing between Exchange servers using PowerShell scripts, which would migrate mailboxes between the servers and keep track of the changes through alerts. (For client mailbox recovery, you need to know where the mailbox was located.)
Testing to validate the certificates on servers to verify that they exist in the trusted store, the date is valid, and when they will expire. You can pull the cert information using PowerShell with the following command on the Hub Transport server:
```
Get-ExchangeCertificate | fl
```
Using SMS/ConfigMgr to gather an inventory of PST information (users’ personal mail folders) on the network and integrate it into reports, which will assist with capacity planning for Exchange and potential archive solutions.
A health check for the existence of Recovery Storage Groups (RSG). The RSG should be deleted if it is not currently in use.
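As an illustration of the certificate test in the list above, the date checks could be sketched as follows. The Get-CertificateStatus function and its 30-day warning window are our assumptions; the sketch relies only on the NotBefore and NotAfter properties that certificate objects expose, and it does not check the trusted store.

```powershell
# Hypothetical helper: classifies one certificate by its validity dates.
# $WarningDays (30 by default) is an arbitrary choice for this sketch.
function Get-CertificateStatus {
    param($Certificate, [datetime]$Now = (Get-Date), [int]$WarningDays = 30)
    if ($Now -lt $Certificate.NotBefore) { return 'NotYetValid' }
    if ($Now -gt $Certificate.NotAfter)  { return 'Expired' }
    if ($Certificate.NotAfter -lt $Now.AddDays($WarningDays)) { return 'ExpiringSoon' }
    return 'Valid'
}

# On a Hub Transport server, this could be combined with Get-ExchangeCertificate:
# Get-ExchangeCertificate | foreach { "$($_.Thumbprint): " + (Get-CertificateStatus $_) }
```

A management pack rule or task could then alert on anything other than a Valid result.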

The Exchange 2007 management pack has the potential to provide a significant increase in functionality within OpsMgr, especially with the integration of PowerShell into the core functionality of Exchange 2007.

Active Directory

Active Directory maintains and provides information such as users, groups, passwords, security, and other information required within a directory service. The discussion of creating user accounts in Exchange 2007/Active Directory, in the “Exchange 2007” section of this chapter, shows the level of integration between those two products.

The Active Directory management pack enables OpsMgr to gather information on Active Directory to identify issues in your domain environment. Looking at the integration between the System Center family of products leads to another series of potential capabilities for making an environment more capable of being automatically adapted.

If the System Center Capacity Planner can be integrated with OpsMgr (see the “Capacity Planner” section of this chapter), and if Active Directory can be designed using a future version of SCCP, it introduces the ability to make capacity decisions based on how the current environment is designed.

This type of functionality has significant potential implications. As an example, if remote users are continuously experiencing long delays logging in to the environment, you could run SCCP to simulate a domain controller installed at the remote user location, thus validating whether it would be beneficial to add a domain controller at that location.

VMWare MP

Multiple vendors, including eXc (http://www.excsoftware.com/version3/version3/Products.aspx), Jalasoft (http://www.jalasoft.com/jalasoftweb/jsp/products/xianio/), and nworks (http://www.nworks.com/vmware/index.php), offer management packs providing monitoring capabilities for the VMWare product line. The ability to monitor VMWare in a capacity similar to the Virtual Server and SCVMM products opens up the potential to provide a dynamically changing virtual environment that will adjust based on changes in the environment, similar to those discussed within the “Virtual Machine Manager and Virtual Server 2005 R2” section of this chapter.

Custom Management Packs

For the purposes of building the foundation for an environment that can automatically adapt, developing custom management packs is an important capability because it allows us to add functions not currently available in OpsMgr. In Chapter 23, “Developing Management Packs and Reports,” we discuss the process used to create custom management packs for OpsMgr 2007.

Some examples of adding functionality would include management packs with capabilities such as the following:

Notifying the help desk when a user account locks out (so they can contact the user instead of the user contacting the help desk)
Monitoring for files placed in a particular folder and performing a specified action when they are found
Providing database sizing trending information for the OpsMgr databases in your environment

Each of these different technologies provides a piece of the landscape required to develop an environment where OpsMgr 2007 can automatically adapt to changes. In the next section, we will discuss a variety of possible concepts and scenarios for applying these technologies with Operations Manager 2007.

Automatically Adapting with Operations Manager

We have discussed the inherent capabilities of OpsMgr 2007 and ways to integrate it with other products to increase its functionality. Using these concepts, we created a foundation we can build on for OpsMgr (combined with other products) to automatically adapt to changes in your environment. This section of the chapter provides a vision of how you could configure systems to automatically adapt to changing conditions.

Many organizations prefer an environment where manual alterations are performed, because this minimizes the number of changes, the potential for unexpected changes, and the potential for errors resulting from automated changes. The common response here against automating the responses to particular errors is, “What if something goes wrong?”

In numerous situations, automatically adapting an environment is extremely useful. Examples include changes that are time critical and cannot wait on manual intervention, commonly executed tasks, and tasks that are often overlooked. We will review several of these scenarios, in addition to concepts to consider about how you might automatically change systems based on circumstances as they occur.

Maintaining Systems

When life as an administrator gets really busy, the first thing typically dropped is standard maintenance procedures. Administrators generally focus on the higher priority issues that occur, and do not have the time to spare to work on lower priorities. This is an area where OpsMgr automation can assist us.

Operations Manager 2007 gathers information on systems, including when they have high and low utilization. This puts Operations Manager 2007 in a unique situation—it has the information required to determine when to schedule maintenance based on when it would have the least impact on the system.

Using Operations Manager, we can create a script that executes daily, weekly, or monthly, performing checks of commonly missed maintenance tasks such as defragmentation, patch management, antivirus updates, and even determining whether files on the system have been backed up within a specified period of time. This script is assigned to a computer group. This can be an existing computer group, or we can create our own maintenance computer group.

The script tests for each condition. If there is an action that needs to be performed, it writes an event to the event log on the local system. We configure OpsMgr with a rule that checks for each condition and acts accordingly.

Disk Defragmentation

Let’s walk through an example of this. We create a computer group called Maintenance in OpsMgr that has a timed rule that executes a script on the target computers on a weekly basis. The script checks the level of disk fragmentation; if it is determined necessary to “defrag” the disk (defined as higher than 30% total fragmentation), the script writes an event to the Windows Application log on that system that includes specific information:

A source of OpsMgr Maintenance
An event number of 1001
An event category that indicates the drive letter that is heavily fragmented

If multiple drives are fragmented, the script writes one event for each drive. If a drive is not heavily fragmented, the script writes event 1000 instead.

We create a Logical Disk Fragmentation monitor that is in a healthy state when the event number is 1000 and in a warning state when it is 1001. For the warning state, we define a recovery that runs a command to defragment the drive and then resets the monitor state.

At the weekly timed interval, the members of the computer group (in this case, Hurricane) run the maintenance script. The script detects that the C: drive is highly fragmented, and it writes event 1001 to the Windows Application log from the OpsMgr Maintenance source, with an event category of C (C being the drive letter that is highly fragmented). The monitor finds event 1001 and changes to a warning state, causing the recovery to fire. The recovery runs the defragmentation process on the system for the drive specified and resets the monitor to a healthy state.
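The decision the maintenance script makes for each drive can be sketched as follows. The event numbers, the source name, and the 30% threshold come from the example above; the function names are hypothetical, and how the fragmentation percentage is measured is left outside this sketch.

```powershell
# Threshold from the example: higher than 30% total fragmentation
$FragmentationThreshold = 30

# Returns 1001 (heavily fragmented) or 1000 (healthy) for one drive
function Get-DefragEventId {
    param([double]$PercentFragmented)
    if ($PercentFragmented -gt $FragmentationThreshold) { 1001 } else { 1000 }
}

# Builds one record per drive describing the event the script would write.
# $Drives maps a drive letter to its total fragmentation percentage.
function Get-DefragEvent {
    param([hashtable]$Drives)
    foreach ($letter in ($Drives.Keys | Sort-Object)) {
        $percent = $Drives[$letter]
        $record = New-Object PSObject
        $record | Add-Member -MemberType NoteProperty -Name Source -Value 'OpsMgr Maintenance'
        $record | Add-Member -MemberType NoteProperty -Name EventId -Value (Get-DefragEventId $percent)
        $record | Add-Member -MemberType NoteProperty -Name Drive -Value $letter
        $record
    }
}

# Illustrative input: C: is 42% fragmented, D: is 12% fragmented
Get-DefragEvent @{ 'C' = 42; 'D' = 12 }
```

On a managed system, each record would then be written to the Application log, for example with the .NET System.Diagnostics.EventLog class after the event source has been registered. Keeping the decision separate from the logging makes the threshold easy to test and adjust.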

An added benefit of this approach to system maintenance is that OpsMgr could also place the system into maintenance mode while any or all of the maintenance tasks run. (We discuss a script to automate maintenance mode in Chapter 8, “Configuring and Using Operations Manager 2007.”)

Patch Management

Patch management is another aspect of maintenance often overlooked. Organizations can either deploy a product to manage patches in their environment (such as Configuration Manager, WSUS [Windows Server Update Services], or third-party vendor software), or they can manually patch their systems. Manually patching is very labor intensive but quite common.

MOM 2005 included a Baseline Security Analyzer management pack that would check for a variety of conditions, including the current patch state. Currently there is no equivalent functionality available for OpsMgr 2007.

If we look into how we would integrate checking patch management status into our maintenance script, it might look like this:

When the maintenance script runs, the script would check the current state of patching. If it is out of compliance, the script writes the event 1011 from the OpsMgr Maintenance source (and if it is in compliance, event 1010 is written).
There would be a monitor to check for this, which would have a healthy state for the 1010 condition and a warning state for 1011.
A recovery would either call a script to patch the system or add the system to the patch management collection (see the script AddComputerToCollection.vbs [included on the CD with this book] for an example).
If there is no automated method available to patch the system, we could configure an alert to notify the appropriate personnel that the system is not being patched.
The alert could be integrated with a workflow that asks for approval from an administrator (or user) and then causes the system to be patched during the next maintenance window.

Disk Backups

Backups are another aspect of maintenance to consider as part of a maintenance script. If the script discovered that the drive had not been backed up within a configured threshold (30 days, for example), it would write an event for a warning state; if the drive had been backed up within that period, it would write an event for the healthy state. Based on the events, another monitor would be configured with a recovery defined to run a backup on the system and to store the data on a defined network share. This network share would need to have a large amount of storage available. You would also want to monitor it (with OpsMgr!) to prevent it from filling up completely.
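The backup age test reduces to a date comparison. The following is a minimal sketch, assuming the script can already obtain the date of the last backup; the Get-BackupState name is hypothetical, and treating a drive with no recorded backup as a warning is our assumption.

```powershell
# Hypothetical helper: 'Healthy' when the last backup falls inside the
# threshold, otherwise 'Warning'. No recorded backup counts as a warning.
function Get-BackupState {
    param($LastBackup, [datetime]$Now = (Get-Date), [int]$ThresholdDays = 30)
    if ($LastBackup -eq $null) { return 'Warning' }
    $age = $Now - [datetime]$LastBackup
    if ($age.TotalDays -gt $ThresholdDays) { 'Warning' } else { 'Healthy' }
}
```

The script would write the warning or healthy event based on this result, in the same way as the defragmentation check.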

Tip: Backing Up Running VMs

You can even integrate backup automation with running virtual machines. Redmondmag.com provides a script that backs up Virtual Server 2005 SP 1 virtual machines while they are running, using the Volume Shadow Copy Service (VSS). Information on the script and the article are available at http://redmondmag.com/columns/print.asp?EditorialsID=2324.

Antivirus

Antivirus configurations are generally handled through scheduled updates, but the same concepts discussed for defragmentation, patch management, and backups apply to antivirus software as well. If a system’s antivirus data files were found to be out of date, a recovery would run a script to update them.

Additional Maintenance Functions

As part of the same set of monitors, you can apply the same concepts to additional maintenance functions. As another example of this, in the “Diagnostics and Recoveries” section of this chapter, we discussed a recovery that would run a script to free up disk space by either removing unnecessary files or relocating them to network temporary storage.

The ability to automate routine maintenance tasks such as defragmentation, patch management, backups, antivirus, and drive space checks within Operations Manager 2007 provides a method to gain additional benefits from deploying OpsMgr, thus freeing up administrators to focus on other aspects of the environment. (Remember the 10 reasons for deploying Operations Manager discussed in Chapter 1, “Operations Management Basics”?)

It is important to provide a log of what automatic changes occur from your OpsMgr automation. A log provides both the information required to roll back from a change, if required, and a way to report on what OpsMgr has accomplished (for example, in the month of December how many systems were defragmented, how many disks were backed up, and so on).
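One simple way to keep such a log is a text file with one line per automated change. The following is a minimal sketch; the file layout (timestamp, computer, action) and both function names are hypothetical choices, not an OpsMgr feature.

```powershell
# Hypothetical log format: "yyyy-MM-dd HH:mm:ss,Computer,Action", one line per change
function Add-MaintenanceLogEntry {
    param([string]$Path, [string]$Computer, [string]$Action, [datetime]$When = (Get-Date))
    $line = '{0},{1},{2}' -f $When.ToString('yyyy-MM-dd HH:mm:ss'), $Computer, $Action
    Add-Content -Path $Path -Value $line
}

# Counts the logged actions for one month, for example how many
# defragmentations ran in December.
function Get-MaintenanceSummary {
    param([string]$Path, [int]$Year, [int]$Month)
    $prefix = '{0:0000}-{1:00}-' -f $Year, $Month
    Get-Content $Path |
        Where-Object { $_.StartsWith($prefix) } |
        ForEach-Object { $_.Split(',')[2] } |
        Group-Object |
        Select-Object Name, Count
}
```

Each recovery script would call Add-MaintenanceLogEntry after it completes, and Get-MaintenanceSummary would then provide the monthly counts for reporting.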

Distributed Application Provisioning

As we discussed in previous chapters, OpsMgr focuses on health, and as a major part of this it includes the ability to monitor distributed applications (such as Active Directory, Exchange, custom-built applications, and even Operations Manager itself). As part of automatically adapting with OpsMgr 2007, we will discuss several options available for distributed applications to adapt to changes detected in the environment.

The Operations Manager Management Group

We will start with the concept of OpsMgr as a distributed application. OpsMgr does many things very well, but one thing it does not do well is correlate events. As an example, within our Odyssey organization we have a site in Plano and a site in Carrollton. We have servers in both locations, monitored by a centralized OpsMgr environment located in Plano. We monitor the site links with a TCP port test so we can identify when the link goes down. When the link goes down, we receive our alert from the TCP port test but we also receive health alerts that each server on the other side of the link is down as well. When the link comes back online, we receive a set of alerts from the remote servers indicating that they had problems while the link was down.

Although this is logical, it is not that helpful. The individual alerts telling us that we cannot talk to each server are only symptoms of the lost connectivity. What we care about is that the link is down, because it affects how (or if) servers and workstations can communicate with each other, and we understand the ramifications of that situation. So maybe we could adapt to the loss of the link within OpsMgr?

As an example, if the TCP port test fails, OpsMgr could put all remote servers into maintenance mode, close the alerts raised for the servers in the remote site, and alert that the link is down. Then, when the link comes back up, it could bring the servers back out of maintenance mode. We like this approach because we now receive only the alert telling us that the link is down (the root cause), not the various side effects that resulted from the link going down. This is a powerful capability: it keeps a single event from snowballing into a flood of secondary alerts!
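The logic of this approach can be sketched in a few lines of Python. This is an illustration only: the alert dictionaries, the "site-link" source name, and both function names are assumptions we made up for the sketch, and a real implementation would call the OpsMgr SDK or Command Shell to change maintenance mode and close alerts.

```python
import socket


def link_is_up(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds (a simple port test)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def adapt_to_link_state(link_up, remote_servers, in_maintenance, open_alerts):
    """Return the new maintenance set and alert list for the current link state.

    Alerts are plain dictionaries with "source" and "text" keys (hypothetical model).
    """
    remote = set(remote_servers)
    if link_up:
        # Link restored: bring the remote servers back out of maintenance mode.
        return set(in_maintenance) - remote, list(open_alerts)
    # Link down: suppress the remote servers and close their symptom alerts.
    new_maintenance = set(in_maintenance) | remote
    alerts = [a for a in open_alerts if a["source"] not in remote]
    # Raise a single root-cause alert for the link, without duplicating it.
    if not any(a["source"] == "site-link" for a in alerts):
        alerts.append({"source": "site-link", "text": "Link to remote site is down"})
    return new_maintenance, alerts
```

The decision function is kept separate from the port test so that it can be exercised without a network; a rule would feed it the result of link_is_up for the monitored link.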

Custom Distributed Applications

Custom-developed applications also need to be able to adapt to changes in the environment. For Odyssey, we have a distributed application called OdysseySimpleApp. Our custom application uses a Windows service that often stops responding. A recovery can perform an action such as restarting the application, running a script, or restarting a service, so the fix we would otherwise apply by hand (restarting the service) can now be automated via a recovery.

Distributed applications can often have a tie into the world of Service Oriented Architectures (SOA). SOA, looked at from a high level, isolates the core business functions into independent services. These services work like functions that are called.

As an example of this, we can look at the OdysseyWeb application. OdysseyWeb is a web-based application that runs on multiple IIS 6.0 servers for the frontend, running in a load-balanced configuration (WebServer1 through WebServer50). The web servers communicate with a service that runs on a series of load-balanced processing systems that perform the required transactions (Processing1 through Processing50). The data required for the application is stored in a SQL 2005 cluster (Data1 through Data4). From an SOA perspective, we have a frontend service, a processing service, and database services.

For monitoring distributed applications, we perform tests at each level of the application to validate functionality. Starting with the web services, we test for connectivity to each web server to validate that they are responding to port 80/443 calls. If they are not responding, a recovery task is fired, resetting the web-based application. On the processing systems, we check via a TCP port test and performance counters to validate that transactions are actively being accepted by the services. If the transactions are not being accepted, the service is restarted as part of a recovery task. On the database cluster, we perform an OLE DB Data Source check to validate that it is functional. If it is not, the cluster is moved to the second node and retested.
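Each tier above follows the same pattern: test, recover on failure, and test again. A minimal sketch of that pattern follows; the function name and the result dictionary are hypothetical, and the check and recovery callables stand in for whatever test (port check, performance counter, OLE DB query) and recovery task apply to the tier.

```python
def check_with_recovery(tier, check, recover):
    """Run check(); if it fails, run recover() once and retest.

    check   -- callable returning True when the tier is healthy
    recover -- callable that attempts the recovery (restart, failover, and so on)
    """
    if check():
        return {"tier": tier, "healthy": True, "recovery_ran": False}
    recover()
    # Retest after the recovery, as in the database cluster example.
    return {"tier": tier, "healthy": check(), "recovery_ran": True}
```

If the tier is still unhealthy after the retest, the result tells us that the automated recovery was not enough and that an alert to an operator is warranted.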

For an overall test of the application, we configure a synthetic test, using the Web Application Management Pack Template. If the synthetic test fails, we perform a recovery based on the error, provided within the test. With this example we see some of the range of tests and recovery tasks that can be accomplished with OpsMgr, specifically related to how distributed applications function. Later in the “Server Provisioning” section of this chapter, we will discuss server adaptation and addressing changes in application performance with changes to the servers themselves.

Workflow and Application Provisioning

Application provisioning is another area where OpsMgr can adapt to changes. Although OpsMgr is not a workflow product, it can respond to events that occur in the monitored environment. As an example, we could create a SharePoint website to provide an area for users to request applications. As part of the workflow, the requests could be accepted or denied, based on manager approval. If a request to install an application for the user is approved, an event can be written on the SharePoint server indicating the application and the workstation it should be deployed to.

OpsMgr can have a rule that checks for this event and then updates the SMS/ConfigMgr collection to include the new user or workstation (see the AddComputerToCollection.vbs script on the accompanying CD for an example of how to add computers to collections). Through this, OpsMgr provides a part of the workflow required to deploy (or potentially uninstall) an application in the environment.

If we consider the concept of application provisioning, by adding Service Manager we can now take the concepts above where OpsMgr provides the workflow and replace that with the capabilities of ServiceMgr, using the self-provisioning functionality discussed in the “Service Manager” section of this chapter. The self-provisioning capabilities of ServiceMgr offer a more seamless method to integrate the System Center product line and automatically adapt to changes such as when application deployments are required.

We could also automatically deprovision applications. SMS and ConfigMgr provide the ability to meter application usage. If a metered application is not used within a specific timeframe (such as 13 months), an event could be fired that causes SMS/ConfigMgr to remove the application from the systems where it is not being used. Automated application deprovisioning provides an effective method to decrease the software licensing requirements of an organization, because the application remains only on systems where it is actually used. If we wanted a check-and-balance on application deprovisioning, OpsMgr could send a notification to the users, asking whether the application should remain on their systems. If they approve its removal, SMS/ConfigMgr can uninstall the software; if they ask to keep it, they would be notified again if the application were still unused after another defined timeframe.

You could configure this deprovisioning as part of a scheduled process, where one month prior to reviewing existing product licenses (called True-Up in Microsoft Enterprise Agreements), an automatic deprovision of applications would occur, based on applications that have not been used in the specified period of time. By running this a month before True-Up, there is time for the users to determine whether they actually require the application and reprovision it to their systems. This approach provides a method for organizations to truly pay only for the software they are using.
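The selection step in this process can be sketched as follows. The data layout (a dictionary keyed by computer and application), the function names, and the clamping of the day of the month are all assumptions for illustration; real usage data would come from SMS/ConfigMgr software metering.

```python
from datetime import date


def months_ago(today, months):
    """Return the date `months` calendar months before `today`.

    The day is clamped to 28 so the result is always a valid date (a simplification).
    """
    index = today.year * 12 + (today.month - 1) - months
    return date(index // 12, index % 12 + 1, min(today.day, 28))


def deprovision_candidates(last_used, today, months=13, keep_requests=frozenset()):
    """List (computer, application) pairs not used within the timeframe.

    last_used     -- dict mapping (computer, application) to the last-use date
    keep_requests -- pairs whose users asked to keep the application
    """
    cutoff = months_ago(today, months)
    return sorted(
        key for key, used in last_used.items()
        if used < cutoff and key not in keep_requests
    )
```

Pairs listed in keep_requests are skipped for this run and are evaluated again the next time the process is scheduled.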

User and Computer Provisioning

OpsMgr can also assist with the process of adding users to a domain or automating the process for deprovisioning user and computer accounts.

When you add a user account in the Active Directory domain, it generates an event that OpsMgr can respond to. When OpsMgr finds the event, it adds information to SharePoint, which would add the user automatically to a user provisioning workflow. Applications could also be provisioned to the user, as discussed in the “Distributed Application Provisioning” section of this chapter. A simpler interaction would be to create a help desk ticket when the user-creation event occurs. As we discussed in the “Service Manager” section of this chapter, Service Manager can automatically generate tickets based on alerts from OpsMgr.

We can also use OpsMgr to automatically deprovision unused user and computer accounts. Computer accounts that are inactive for a defined time period can be moved to an Active Directory Organizational Unit (OU) where they are stored for an additional retention period. If the computer accounts are still inactive at the end of that period, they can be deleted.

These same concepts apply to user accounts. A user inactive for a defined time period can be moved to an OU where the account is stored for an additional retention period (you would want to identify a set of accounts not to disable or delete). If the user account is still inactive after the additional retention period, the user account is deleted. As an example, if user account Joe.Smith has not been used for 60 days, it is placed in the DISABLED OU. If the same user does not access his account for another 90 days, the account is automatically deleted from Active Directory.

The timings of when accounts are disabled or deleted could be customized to the requirements of the specific environment, and exclusions can be made for user or computer accounts that should not be subject to automatic deprovisioning. These processes can be automated, and we provide a set of scripts with this book. You would want to log all items and publish them on a SharePoint website, or send them to a manager for auditing purposes.
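The two-stage decision described above can be sketched as a single function. This is not the script from the CD; the function name, the returned action strings, and the parameter names are hypothetical, and the defaults simply mirror the 60-day and 90-day example.

```python
from datetime import date


def account_action(name, last_logon, today, is_disabled, exclusions=(),
                   disable_after_days=60, delete_after_days=90):
    """Decide what to do with one user or computer account.

    Returns "keep", "move_to_disabled_ou", or "delete".
    """
    if name in exclusions:
        return "keep"  # accounts that must never be deprovisioned automatically
    idle_days = (today - last_logon).days
    if is_disabled:
        # Already in the DISABLED OU: delete only after the extra retention period.
        if idle_days >= disable_after_days + delete_after_days:
            return "delete"
        return "keep"
    if idle_days >= disable_after_days:
        return "move_to_disabled_ou"
    return "keep"
```

An account is never deleted directly from its original OU; it must pass through the DISABLED OU first, which gives administrators a window to rescue it. Each returned action should also be written to the audit log.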

Tip: On the CD

Sample scripts for automatically deprovisioning user and computer accounts are included on the CD accompanying this book.

Security Adaptation

OpsMgr can also automatically adapt to changes that result from security modifications that may occur. With a strong auditing policy in place and Audit Collection Services (ACS) integrated (see Chapter 15, “Monitoring Audit Collection Services,” for details), Operations Manager is capable of immediately being aware of changes in the security environment. Once a change is detected, we can respond to the change in security using the concepts discussed throughout this chapter.

As an example of how OpsMgr can automatically adapt to a security change, we will take a situation where a user has attempted to log in to the network multiple times but mistyped his password and ended up locking out his user account. Normally, the user would call the help desk, and the help desk would work with him to unlock the account. Alternatively, OpsMgr could detect the account lockout and notify the help desk via email so that they could contact the user instead of the user contacting them! This allows the help desk to be more proactive in nature, which directly affects the user’s satisfaction in his experience with the help desk. If the user did not mistype his password (as in a case where someone else is attempting to access the account), the help desk can now identify this situation more quickly as well.

We can also proactively gather information that will assist in determining whether a user lockout is actually an attempt to breach security or is just a case of a mistyped password. If multiple failed logon attempts occur, we could configure a diagnostic to activate a NetMon capture to collect network information at the time of the user lockout. Reviewing the network information could generate additional information about the attempts to log in to the systems. A good example of creative uses of NetMon traces is available at http://marcusoh.blogspot.com/2007/07/os-capturing-netmon-traces-in-such.html, where a NetMon trace starts based on an event and stops based on a specified condition.
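One way to decide when the diagnostic should fire is to look for a burst of failed logons inside a short window. The sketch below shows that test in Python; the threshold of five failures, the ten-minute window, and the function name are assumptions for illustration, and the timestamps would come from the security events collected for the account.

```python
from datetime import datetime, timedelta


def should_start_capture(failure_times, threshold=5, window=timedelta(minutes=10)):
    """Return True if `threshold` failed logons fall within any `window` of time."""
    times = sorted(failure_times)
    start = 0
    for end, stamp in enumerate(times):
        # Slide the start of the window forward until it spans at most `window`.
        while stamp - times[start] > window:
            start += 1
        if end - start + 1 >= threshold:
            return True
    return False
```

A handful of failures spread over a day is more likely a forgetful user, whereas a tight burst is a better reason to start a network capture.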

Security attacks or changes in security can occur very quickly, and OpsMgr can help by responding to the conditions more quickly than manual intervention can take place. Automatically adapting to security situations in real time is one of the key benefits to embracing this concept!

Server Provisioning

We saved the best for last here, because the potential benefits for server provisioning are among the most striking areas where OpsMgr can automatically adapt. Before we can discuss the OpsMgr capabilities from a server provisioning perspective, however, we need to review the concepts of scaling servers.

Scaling an application can provide the ability to increase the number of individuals who can use the application, or the ability to adapt to changes in business requirements. An application that can only run on a single server in a single-processor configuration is an example of an application that does not scale well. In a well-designed enterprise application, all components of the application need to support scaling to be able to provide continuous growth to meet user demand and business requirements. Scaling occurs either through vertical or horizontal scaling.

Vertical scaling (often referred to as scaling up) increases scalability through adding capacity to existing resources. Examples of this include adding more processors/memory/network adapters or faster disks. Vertical scaling depends on the application or service being able to leverage the new resources. As an example, a single-threaded application will not scale well to multiple processors.

Horizontal scaling (often referred to as scaling out) increases scalability through adding different systems that work together as one logical unit. A good example of horizontal scaling is web farms, which can have their scalability increased through the addition of more servers to the web farms.

Scaling Out

Let’s examine a situation where we have multiple web servers that provide a component of a web-based application. There are three available web servers, with each web server running as a guest operating system on a different host operating system. Normally, two of the web servers are sufficient to meet the requirements of the web piece of the application.

Suppose that OpsMgr detects a bottleneck at both the application and the operating system level. From this, we can determine that the two active web servers are each reporting high processor levels, and the application response time is not within acceptable performance levels. OpsMgr performs a recovery based on this condition and activates the web server on the third system. This action splits the load among three web servers instead of two. If the load on the websites then declines for a long period of time, OpsMgr sets the third web server to stop accepting new connections. Once there are no remaining connections, that virtual server is put back into a saved state.

This process may sound like a far-fetched concept, but this is a technically viable solution using components currently available: OpsMgr 2007 combined with the SCVMM management pack, using Virtual Server 2005 R2. Figure 20.17 shows the SCVMM management pack in OpsMgr, with a guest operating system (Honor). Available tasks for the guest operating system include Create Checkpoint, Pause, Save State, Shutdown, Start, and Stop. As we discussed in the “Console Tasks” section of this chapter, we can integrate console tasks into diagnostics and recoveries, which in turn activate based on conditions found in the environment.

Figure 20.17. The System Center Virtual Machine Manager management pack.

Using the Virtual Machine Manager management pack shows how OpsMgr can automatically adapt to changes in the environment based on application performance, and that it can scale out and back in based on the requirements of the application.
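The life cycle of the third web server in this example can be modeled as a small state machine with three states: saved, active, and draining. The Python below is a sketch of the decision logic only. The thresholds, the state names, and the function name are assumptions, and the actual Start and Save State actions would be carried out by the management pack tasks.

```python
def next_state(state, cpu_percent, response_ms, quiet_minutes, connections,
               cpu_high=85.0, response_limit_ms=2000.0, quiet_required=60):
    """Return the next state of the standby web server.

    state         -- "saved", "active", or "draining"
    cpu_percent   -- processor level reported by the active web servers
    response_ms   -- measured application response time
    quiet_minutes -- how long the load has stayed low
    connections   -- open connections on the standby web server
    """
    if state == "saved":
        # Both the operating system and the application must show a bottleneck.
        if cpu_percent >= cpu_high and response_ms > response_limit_ms:
            return "active"      # recovery: start the third guest
        return "saved"
    if state == "active":
        if quiet_minutes >= quiet_required:
            return "draining"    # stop accepting new connections
        return "active"
    if state == "draining":
        if connections == 0:
            return "saved"       # no users left: save the guest's state
        return "draining"
    raise ValueError("unknown state: " + state)
```

Requiring a sustained quiet period before draining keeps the server from flapping between active and saved when the load hovers near the threshold.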

Scaling Up

Vertical scaling can also occur with OpsMgr 2007, but capabilities are currently limited. If OpsMgr identifies a bottleneck condition in a guest operating system running within Virtual Server 2005 R2, the system can be shut down at a scheduled interval, additional memory or disk resources can be added, and the system can then be brought back online. These same functions can be performed using the VMware product line (which also requires that the system be shut down to add resources). We will discuss updates to the Microsoft virtualization functionality at the end of this chapter.

Provisioning Techniques

OpsMgr can also automate the process of provisioning servers. You can provision server resources automatically, either through SMS/ConfigMgr using Operating System Deployment (OSD) or through SCVMM. With a pool of available servers, OpsMgr could automatically provision as required, using the OSD functionality. This, however, would represent a large investment in hardware that would not be efficiently used until it was required.

The usage of virtualization technologies to provision servers provides a method that dramatically increases the efficiency of resource utilization. The SCVMM interface provides self-service server-provisioning functionality that can be integrated with OpsMgr. The same processes used to provide the functionality within the self-service server provisioning could be automated based on conditions identified by OpsMgr.

For the Odyssey Corporation, each remote location has a server that runs Virtual Server 2005 R2 to use for various server functions. By monitoring Active Directory, Odyssey can determine whether user logon times are within an acceptable range. If they are not, OpsMgr can provision a server in the remote location to provide domain controller functionality and can automate the process of performing a dcpromo of the system into a domain controller. This is an extreme example of how one can automate the provisioning of servers to address issues identified by OpsMgr. Most organizations would not want to fully automate the process of promoting a domain controller (it could be disconcerting to come to work and have a few new DCs online), so this could be split into separate tasks: one that provisions the server and another that performs the dcpromo.

OpsMgr can also use the capabilities of SCVMM to determine which servers are good candidates for virtualization, and provide tasks to automate the process to convert the system from a physical server to a virtual server. Providing the capability to consolidate physical servers onto virtual servers decreases the hardware requirements for an organization and increases the usage of the server resources that are available.

In the “Capacity Planner” section, we discussed how to integrate SCCP with OpsMgr to identify potential bottlenecks and configuration changes, thus addressing them proactively. Applying these capabilities to virtualize servers, automatically provision servers, and proactively plan for potential hardware changes together opens up a completely new level of concepts to consider.

As an example, we will discuss an Exchange 2007 environment with two mailbox servers and one Client Access Server. SCVMM recommended virtualizing the Client Access Server, so it was converted from a physical to a virtual system. If the Capacity Planner identifies a potential bottleneck on the Client Access Server, an additional server can be provisioned, and the new Client Access Server can share the load with the original one. This is an example of how these various technologies could all work together to provide an environment that continues to evolve and adapt to changes in the environment around it.

Servers provide functionality to an organization and normally have a life cycle. New servers are typically added to provide new functionality as required. Applications and functions are replaced over time, but often the server providing the functionality is not identified for removal or reprovisioning. OpsMgr can call scripts to identify whether or not the system is actively being used. If a system is identified as not in use for a long enough period of time, the server can be deprovisioned and then can either be removed (if it is obsolete) or reprovisioned for usage elsewhere in the organization.

When OpsMgr is combined with other technologies, there is potential to provide both increased scalability and performance for the applications used in an organization.

Virtualization: Windows Server 2008 and Beyond

During the early beta versions of Windows Server 2008, functionality was available that allowed for the addition and removal of resources to the operating system while the system was running. Although this functionality was removed from the product, it indicates a direction where Microsoft may be moving beyond the Windows Server 2008 timeframe. Microsoft is also introducing new virtualization technologies within Windows Server 2008 that will increase the scalability of guest operating systems so that they can use more than one processor and more than 3.6GB of memory. It is not known at this time whether this functionality will be included with the release version of Windows Server 2008 or with the hypervisor technology currently scheduled for release approximately 6 months after Windows Server 2008 is released.

These enhancements have significant impacts on the ability to automatically adapt an environment. Increasing the amount of resources available within a guest operating system and adding the capability to add and remove resources while the operating system is running allow us to greatly increase the ability to vertically scale our servers.

As a last example, let’s take an application that cannot be configured to horizontally scale. Due to limits in the application, it can only run on a single server, but it is a multithreaded application. OpsMgr detects a bottleneck on the application and the guest operating system. OpsMgr adds resources to the guest operating system to address the bottleneck. Once the resources are no longer required, they are removed and added back to the pool of available resources. This floating resource pool is available to provide vertical scaling for each of the business-critical applications in the environment.
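The bookkeeping behind such a floating pool can be sketched as a small class. This is a conceptual model only: the class, its method names, and the use of memory in megabytes are assumptions, and it does not represent any Windows Server 2008 or SCVMM interface.

```python
class ResourcePool:
    """Track memory that can be lent to guests and returned later (conceptual model)."""

    def __init__(self, free_memory_mb):
        self.free_memory_mb = free_memory_mb
        self.grants = {}  # guest name -> memory currently on loan

    def grant(self, guest, memory_mb):
        """Lend memory to a guest; return False if the pool cannot cover it."""
        if memory_mb > self.free_memory_mb:
            return False
        self.free_memory_mb -= memory_mb
        self.grants[guest] = self.grants.get(guest, 0) + memory_mb
        return True

    def release(self, guest):
        """Return everything the guest borrowed to the pool; report the amount."""
        returned = self.grants.pop(guest, 0)
        self.free_memory_mb += returned
        return returned
```

A recovery would call grant when the bottleneck is detected, and a later rule would call release once the condition clears, making the memory available to the next application that needs it.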

Summary

Although Operations Manager 2007 provides for solid monitoring and notification of issues in an environment, this is not the limit of its capabilities. OpsMgr’s ability to automatically adapt can be summarized as follows: Can you identify it? Can you script it? If so, you can do it with OpsMgr.

This chapter not only explained how OpsMgr can adapt, but also discussed some of the capabilities you have with OpsMgr. With Microsoft’s creation of the PowerShell functionality and its integration into products, we expect those things we can automate today to be only the tip of the iceberg.

In the next chapter, we will discuss how you can integrate Operations Manager 2007 and System Center Essentials using Microsoft’s Remote Operations Manager, thus increasing the functionality of both applications.
