CPU and memory issues

Most high CPU and memory issues require working with tech support to determine the root cause through code-level analysis. There are, however, actions you can take, including collecting useful information, both to speed up the investigation and to avoid having to wait for the issue to recur before that information can be captured.

Types of NetScaler CPU

NetScaler has two CPUs that do very different things:

  • The Management CPU handles mainly bookkeeping tasks and the parts of the NetScaler code that run in FreeBSD, such as the various protocol daemons (for example, snmpd). High Management CPU usage, unless prolonged, does not impact packet handling, and a momentary spike should be expected when logs are compressed as part of a rollover.
  • The Packet Engine CPU is entirely dedicated to handling packets; saturation of this CPU can therefore impact your production traffic and needs to be dealt with immediately.

SNMP is the best way to detect high CPU events, as it is not practical to monitor the dashboard constantly. NetScaler sends out specific traps when CPU usage crosses a threshold. You can configure these thresholds by navigating to System | SNMP | Alarms.
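
The same alarms can also be configured from the CLI. The following is a minimal sketch; the threshold and normal values shown here are illustrative, so substitute values that match your own baseline:

    set snmp alarm CPU-USAGE -state ENABLED -thresholdValue 95 -normalValue 35
    set snmp alarm MEMORY -state ENABLED -thresholdValue 95 -normalValue 35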


Consider the following steps when you see the CPU staying pegged at 100%:

  1. Use stat cpu on the NetScaler CLI to see the actual packet engine CPU consumption. If it shows near 100%, try the following steps to lower the potential impact on traffic:
    1. Stop any running traces.
    2. Disable USIP if not absolutely needed.
    3. Where possible, turn off rewrite policies that use regex.
    4. Ensure SSL session reuse is enabled on all SSL vServers. It is by default, but as you will recall from Chapter 2, Traffic Management Features, it needs to be turned off momentarily to obtain a decryptable trace. If it is left off for long in production, however, you can expect increased CPU and SSL card utilization.

    If this is a VPX or SDX, also consider adding additional packet engines. Citrix article CTX139485 shows how to do this for a VPX. The CLI commands behind these mitigations are sketched after this list.

  2. If it's the Management CPU that is shooting to 100%, run the shell command top and look for processes other than NSPPE that are taking up a high CPU percentage; the culprit can be any of the daemons that run in userland, such as nsaaad or httpd. Save this output to a file (for example, top > /var/top.txt) to include with the case information when engaging Citrix Tech Support.

    Note

    When you check the top output, you will see the NSPPE (NetScaler Packet Processing Engine) processes taking up 100%. This is normal: FreeBSD on the NetScaler dedicates all of the CPUs except one (the Management CPU) to the NetScaler OS, and the packet engines poll for packets continuously, so top always reports them at or near 100%.

  3. Generate a show techsupport file and share it with Citrix Tech Support to assist with the root cause analysis. The easiest way to do this is from the GUI, under the Diagnostics tab.
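
Most of the mitigations in step 1 correspond to one-line CLI commands. The following is a minimal sketch, assuming a vServer named vs_secure, a service named svc_web, and a rewrite policy named pol_rw_regex (all hypothetical names); verify the object names in your own configuration first. The last command generates the techsupport bundle described in step 3:

    stop nstrace                                   # stop any running trace
    set service svc_web -usip NO                   # disable USIP where it is not needed
    unbind lb vserver vs_secure -policyName pol_rw_regex -type REQUEST
    set ssl vserver vs_secure -sessReuse ENABLED   # confirm SSL session reuse is on
    show techsupport                               # bundle is written under /var/tmp/support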

Exploring high memory issues

Memory build-ups happen more gradually than CPU spikes. As a result, apart from SNMP monitoring, periodically looking at the dashboard or running stat commands on the NetScaler, as shown in the following example, is a good way to catch them.
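
For example, the following commands give a quick reading without opening the GUI; stat system memory reports the appliance's memory usage counters, while stat cpu helps you correlate that usage with traffic:

    stat system memory
    stat cpu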

Memory build ups can result from:

  • High traffic
  • Memory leaks
  • The use of certain features

Troubleshooting high memory issues

To troubleshoot memory issues, start by plotting memory usage against the traffic being handled. The easiest way to do this is to use the CPU versus Memory versus HTTP Requests Rate graph, which you will find on the dashboard. Interpret it as follows:

  • If memory usage is high but rises and falls with traffic, your NetScaler is insufficiently sized for the load. You might, in the short term, handle the situation by reducing the amount of memory assigned to caching and TCP buffering, or by turning off certain protections in AppFirewall. Another option that you have with AppFirewall is to use the sessionless forms of protection (for example, sessionless form field consistency); the commands for these adjustments are sketched after this list.
  • If the graph instead shows memory only increasing and never dropping, even after peak hours, this could be due to a memory leak resulting from a function not releasing memory it no longer needs. Memory leaks are bugs that require engaging technical support with the help of a techsupport file. Noting any features recently enabled or services newly created on the NetScaler will help identify the cause faster.
  • The shell command nsconmsg -s ConMEM=2 -d oldconmsg | more produces a snapshot of the current memory consumption, giving you insight into the amount of memory each feature is consuming. This will help you understand whether your NetScaler is undersized for the traffic it needs to handle, or whether a particular application is receiving more traffic than you planned for.

    Note

    Memory issues can also manifest due to failed memory hardware. Since memory is detected at boot time, dmesg is a great place to find this information; use the shell command dmesg | grep memory. If the real memory is less than the amount advertised when you purchased the unit, you could be looking at an RMA. A quick way to verify what it should be is to look at the HA peer, since both units in a pair are generally the same model.
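
The short-term relief measures mentioned in the first bullet, which are trimming cache and TCP buffering memory and moving AppFirewall to sessionless protections, map to a few commands. The following is a minimal sketch with illustrative limits and a hypothetical AppFirewall profile name (appfw_prof); check the current values with show cache parameter and show ns tcpbufParam before changing them:

    set cache parameter -memLimit 256
    set ns tcpbufParam -size 64 -memLimit 64
    set appfw profile appfw_prof -sessionlessFieldConsistency ON

Because the sessionless setting changes how form protections are enforced, test it in a maintenance window before relying on it in production.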
