NetScaler Gateway™ VPNs

To get a baseline idea of what a successful login and resource access should look like, over the next few pages we will examine the various stages of a NetScaler Gateway VPN session using a Wireshark capture. The intent is to give you a known-good trace to compare against when troubleshooting issues.

We will then follow up with a discussion of the tools and techniques for troubleshooting NetScaler Gateway VPNs.

Examining VPN session launch using Wireshark

VPN session establishment is a multi-step process in which the client and NetScaler exchange a number of control messages. To make this exchange easier to digest, let's break it into three phases:

  • Phase 1: The EPA exchange
  • Phase 2: The authentication exchange
  • Phase 3: The post-login exchange

Note

To avoid duplication, we will assume the SSL handshake was successful. SSL handshake troubleshooting would be exactly the same as covered in the SSL section of Chapter 2, Traffic Management Features.

Phase 1 – The EPA exchange

Pre-authentication, if configured, will be the first step of the exchange:

[Screenshot: Phase 1 – The EPA exchange]
  1. The client tries to load the VPN login page and, since pre-authentication is configured, gets redirected to the /epatype page (packet 275).
  2. The client visits /epatype and learns what the settings for EPA and the device certificate check are. In our example, EPA is enabled and the device certificate check is off, which is reflected in NetScaler's response:
    [Screenshot: Phase 1 – The EPA exchange]
  3. Now that the client knows EPA is configured, it needs to find out which EPA checks are to be carried out. To do this, it sends a GET to /epaq.
  4. In our test, we are checking for domain membership. Hence, the client sees the following text in the response:
    [Screenshot: Phase 1 – The EPA exchange]
  5. The EPA plugin on the client runs this check and returns a CSEC value that represents the result. In our case, the check passes since the machine is a domain member, so the GET /epas contains 0, which indicates success. In the troubleshooting section, we will talk about how to interpret this value.
    [Screenshot: Phase 1 – The EPA exchange]
  6. The client then receives the login page from NetScaler.
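When comparing a failing trace against this known-good sequence, it can help to check mechanically that the three EPA requests appear in the expected order. The following is a minimal Python sketch, assuming you have already exported the request paths from a decrypted trace into a list. The function names and the descriptions in the table are illustrative and are not part of any NetScaler tooling.

```python
# Hypothetical helper: the paths come from the walkthrough above, while the
# function names and descriptions are illustrative assumptions.
EPA_STEPS = {
    "/epatype": "EPA settings query (EPA and device certificate check)",
    "/epaq": "request for the EPA checks to run",
    "/epas": "EPA result (CSEC value)",
}


def label_epa_requests(paths):
    """Return (path, description) pairs for EPA requests, in trace order."""
    labelled = []
    for path in paths:
        base = path.split("?", 1)[0]  # ignore any query string
        if base in EPA_STEPS:
            labelled.append((base, EPA_STEPS[base]))
    return labelled


def epa_exchange_complete(paths):
    """True if /epatype, /epaq and /epas all appear, in that order."""
    expected = list(EPA_STEPS)  # dicts preserve insertion order
    position = 0
    for base, _ in label_epa_requests(paths):
        if position < len(expected) and base == expected[position]:
            position += 1
    return position == len(expected)


if __name__ == "__main__":
    trace_paths = ["/", "/epatype", "/epaq", "/epas"]
    for base, description in label_epa_requests(trace_paths):
        print(f"{base}: {description}")
    print("EPA exchange complete:", epa_exchange_complete(trace_paths))
```

If the check reports an incomplete exchange, the first missing request tells you at which step of Phase 1 the client stopped.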

Phase 2 – The authentication exchange

The User provides their credentials. As a result, authentication and group extraction happen.

Note

By default, it is the NSIP that gets used for communication with the authentication server. However, using a Netprofile, you can force this to be a specific SNIP to suit your firewall rules.

  1. User authentication happens as a POST request carrying the credentials.
    [Screenshot: Phase 2 – The authentication exchange]
  2. NetScaler first uses its own credentials to authenticate itself to the LDAP server so that it can talk to it (packet 409).
  3. NetScaler then sends the User-provided credentials to the server (packets 414 and 418) for authentication and group extraction.

    Note

    Notice how packet 509 shows the path as /cgi/setclient?agnt. The /cgi/setclient path is what helps NetScaler identify the client device so it can handle the VPN request appropriately. agnt indicates a VPN plugin. For a clientless VPN, this would be /cgi/setclient?cvpn.
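To make the /cgi/setclient convention concrete, here is a small Python sketch that reads the client type from a request path. It only knows the two values mentioned in the note above (agnt and cvpn); the function name and the fallback label are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Only the two values discussed in the text; anything else is reported as
# unrecognised rather than guessed at.
CLIENT_TYPES = {
    "agnt": "VPN plugin",
    "cvpn": "clientless VPN",
}


def client_type_from_path(request_path):
    """Return the client type for a /cgi/setclient request, else None."""
    parts = urlsplit(request_path)
    if parts.path != "/cgi/setclient":
        return None
    return CLIENT_TYPES.get(parts.query, f"unrecognised ({parts.query})")


if __name__ == "__main__":
    for path in ("/cgi/setclient?agnt", "/cgi/setclient?cvpn", "/cfg"):
        print(path, "->", client_type_from_path(path))
```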

Phase 3 – Post-login exchange

Following successful authentication, and depending on whether client choices are configured in the session profile, NetScaler Gateway presents the User with a list of options. The possibilities here are:

  • FULL VPN: Layer 3 Intranet connectivity
  • Clientless VPN: Web Access (HTTP and HTTPS) and ICA access
  • ICA PROXY: ICA only access

    Note

    If ICA Proxy is set to ON, the client choices will not be displayed and the User will go directly to the StoreFront page.

    Sometimes users might report seeing Error: Logins Exceeded after a successful authentication. This can happen for one of three reasons:

    • There are not enough licenses
    • The global AAA User limit hasn't been raised from the default of 5
    • The MaxUsers setting on the VPN vServer is being hit
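The three causes can be lined up against the numbers you collect from the appliance. The following Python sketch is a rough model only: the function and parameter names are hypothetical, the counts must be gathered by you, and None is used here as a made-up convention for "no MaxUsers limit configured". The default of 5 for the global AAA User limit is the value quoted above.

```python
def logins_exceeded_causes(active_sessions, licensed_users,
                           aaa_max_users=5, vserver_max_users=None):
    """Return the limits that the next login would run into.

    All arguments are plain numbers supplied by the caller; None for
    vserver_max_users means no vServer limit is being modelled.
    """
    causes = []
    if active_sessions >= licensed_users:
        causes.append("not enough licenses")
    if active_sessions >= aaa_max_users:
        causes.append("global AAA User limit reached")
    if vserver_max_users is not None and active_sessions >= vserver_max_users:
        causes.append("MaxUsers on the VPN vServer reached")
    return causes


if __name__ == "__main__":
    # Five users already logged in, plenty of licenses, default AAA limit.
    print(logins_exceeded_causes(active_sessions=5, licensed_users=100))
```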

Let's now look at a trace from a scenario in which the User chooses FULL VPN:

  1. The client is sent to a plugin Detection and Download page.
  2. Clicking on the Download link starts the plugin download and installation.
  3. Once the installation is complete, there will be a number of HTTPS exchanges between the client and the VPN vServer to get the connection established.
    [Screenshot: Phase 3 – Post-login exchange]
  4. The following is a description of the requests:
    • /cfg requests are configuration download requests from the VPN client.
    • /cs requests are connection setup messages.
    • /dns requests are DNS requests. By default, they are exchanged as HTTP and converted to regular DNS queries on NetScaler before being sent to the DNS server.
  5. At the end of this connection setup, the User has full Layer 3 connectivity to the company network and can start accessing resources.
  6. At the end of the session, once the User clicks on Logout, a request is sent to /cgi/logout, redirecting the User to the post-logout page. If configured, a clean-up script will be triggered at this point.
  7. If the User chooses CLIENTLESS ACCESS instead, the /cgi/setclient request will carry cvpn rather than agnt. In that case, you will not see any control messages (/cfg, /dns).
    [Screenshot: Phase 3 – Post-login exchange]

Instead, a /cvpn/ path is added to the request path, and the original request is shown as is, Base64 encoded, or encrypted, depending on the Clientless Access URL Encoding setting.
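The request types described in this phase can be tallied from a decrypted trace to see at a glance whether a session looks like a full VPN or a clientless one. This Python sketch assumes the request paths have been exported to a list. The matching rule (a path matches when it equals the prefix or continues with a slash) is an assumption, since the exact path shapes are not spelled out here.

```python
from collections import Counter

# Prefixes taken from the description above; labels are illustrative.
REQUEST_KINDS = [
    ("/cgi/logout", "logout"),
    ("/cvpn", "clientless VPN resource"),
    ("/cfg", "configuration download"),
    ("/cs", "connection setup"),
    ("/dns", "DNS over HTTP"),
]


def classify_request(path):
    """Map a request path to one of the post-login request kinds."""
    base = path.split("?", 1)[0]
    for prefix, label in REQUEST_KINDS:
        # Assumed rule: exact match, or the prefix followed by "/".
        if base == prefix or base.startswith(prefix + "/"):
            return label
    return "other"


def summarise(paths):
    """Count how many requests of each kind appear in the trace."""
    return Counter(classify_request(path) for path in paths)


if __name__ == "__main__":
    sample = ["/cfg", "/cs", "/dns", "/dns", "/cgi/logout"]
    for kind, count in summarise(sample).items():
        print(f"{kind}: {count}")
```

A session with /cfg, /cs, and /dns requests points to the full VPN plugin, while a session made up of /cvpn/ requests points to clientless access.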

Troubleshooting NetScaler Gateway™ VPNs

There are a number of tools and techniques available to troubleshoot the VPN feature. We will explore these in the following order:

  • Debug logs from the client's PC
  • The aaad.debug log file for authentication issues
  • The ns.log on NetScaler for session information
  • The pol_hits nsconmsg counter to verify which policies are getting hit
  • The active User sessions GUI tool
  • Capturing traces

Collecting debug logs from the client's PC

These logs contain a wealth of information across several files on the client's PC. In order to capture the maximum detail, you need to enable debug. This can be enabled in two ways:

  1. Push this setting from the NetScaler Gateway to the VPN plugin on the client machine, using the Client Debug option under Session Profile | Client Experience | Advanced Settings. The User will need to log out and log back in so that the change takes effect.
    [Screenshot: Collecting debug logs from the client's PC]
  2. Have the User select the Record detailed debugging messages option. This is found by right-clicking on the VPN plugin in the system tray and going to the Trace tab in the options:
    [Screenshot: Collecting debug logs from the client's PC]

Once the issue has been reproduced, you can ask the User to run the nsClientCollect.exe utility, which creates a ZIP file containing all the necessary logs so they can be easily shared with you. Here is a sample run of the command:

[Screenshot: Collecting debug logs from the client's PC]

Diagnosing EPA failures

Let's troubleshoot an example EPA failure.

The User reports that they cannot log in and see an error stating that the client machine doesn't meet the security requirements:

[Screenshot: Diagnosing EPA failures]

From the details in the error, it's clear that it's EPA and not authentication that is failing. The best resource for identifying the reason for this failure is the file nsepa.txt, which is picked up by the nsClientCollect utility. Open this file and look for a header called CSEC. This field contains one value, usually 0 or 3, for each of the checks:

  • 0 indicates a success
  • 3 indicates a failure

Now let's consider the following screenshot:

[Screenshot: Diagnosing EPA failures]

Here, the value 03 means that there are two checks (since there are two digits) and that the first check succeeded (0) but the second failed (3). So you need to look at the EPA policy to identify what the second expression is, and then match it to the User's situation to see if it's the User's machine or the expression that needs to be addressed. As well as nsepa.txt, a decrypted trace would also show this information.
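The digit-per-check reading of CSEC is easy to automate when you have many values to go through. The following Python sketch decodes a CSEC string according to the rule described above (0 for success, 3 for failure). Any other digit is reported as unknown rather than interpreted, and the function names are illustrative.

```python
# Meanings as given in the text; other digits are deliberately not guessed.
CSEC_MEANINGS = {"0": "passed", "3": "failed"}


def decode_csec(value):
    """Return (check_number, outcome) pairs, one per digit, counted from 1."""
    results = []
    for position, digit in enumerate(value.strip(), start=1):
        outcome = CSEC_MEANINGS.get(digit, f"unknown ({digit})")
        results.append((position, outcome))
    return results


def failed_checks(value):
    """Return the positions of the checks that failed."""
    return [pos for pos, outcome in decode_csec(value) if outcome == "failed"]


if __name__ == "__main__":
    for check_number, outcome in decode_csec("03"):
        print(f"check {check_number}: {outcome}")
    print("failed checks:", failed_checks("03"))
```

For the value 03 from the example, this reports that check 1 passed and check 2 failed, so the second expression in the EPA policy is the one to examine.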

Note

The domain-joined check is a popular one; you can set it up in this way:

add aaa preauthenticationaction allow_xmx.lab_machines ALLOW
add aaa preauthenticationpolicy is_domain_xmx.lab q/CLIENT.REG('HKEY_LOCAL_MACHINE\\SYSTEM\\CurrentControlSet\\Services\\Tcpip\\Parameters_Domain').VALUE == xmx.lab/ allow_xmx.lab_machines

Using aaad.debug for authentication issues

The file aaad.debug, which we briefly visited in the AAA chapter, is the one you would look at for authentication issues with VPN as well. aaad.debug is especially valuable when using multiple authentication policies, as it allows you to see which of the several authentication policies is failing before you engage in more specific troubleshooting.

Another indispensable aaad.debug feature is the ability to display User group memberships. This is really helpful for identifying situations where incorrect group association is the reason why a User doesn't see the expected resources.

The following screenshot is an example of running cat /tmp/aaad.debug, showing which groups the User is part of:

[Screenshot: Using aaad.debug for authentication issues]

One aspect of group extraction that has been a challenge for a long time is ensuring the right group is picked up when the User is part of more than one group. By default, NetScaler picks up whichever group is returned first, which might not necessarily be the one you are looking for. In other words, there is no notion of priority among the returned groups.

A solution has been included, starting with version 10.1, in the form of the parameter defaultauthenticationgroup. When this parameter is set, upon successful authentication, NetScaler assumes the User to be part of this group and applies the policies bound to this group.

Using ns.log to see authorization and session information

The file /var/log/ns.log should be a familiar one to you by now as we have relied on it for troubleshooting several other feature issues. It is especially useful in a NetScaler Gateway context, since the logs for this feature are captured in a very detailed fashion. Let's explore its usefulness by trying to troubleshoot another example issue.

The issue is that a User passes EPA and authentication successfully, but instead of seeing a homepage, experiences a browser hang followed by a timeout:

[Screenshot: Using ns.log to see authorization and session information]

Upon running tail -f on the ns.log file (tail -f /var/log/ns.log) while the User accesses the page, it becomes evident that it's the session policy 192.168.1.55_443_pol that is denying access:

[Screenshot: Using ns.log to see authorization and session information]

At this point, you will need to look at the settings in the Security tab of the session policy to ensure that either the default authorization setting is adjusted, or that an appropriate authorization policy is used.

ns.log also captures a ton of other information for VPN issues:

  • Timestamps in GMT and local time zones
  • The username
  • The session ID
  • The client IP and port
  • Session start and end time
  • Whether it was mapped IP/SNIP or an Intranet IP (IIP) that was used
  • The VPN vServer IP
  • Destination IP and port (server)
  • Any error messages
  • The policy that kicked in
  • The group that the User was considered to be part of for policy evaluation
  • Total TCP connections, UDP flows

Using the pol_hits counter to examine policy hits

When users log in to a NetScaler Gateway VPN, a combination of policies and profiles governs who gets access to which resources and under what conditions. Issues such as users seeing resources or options they aren't meant to see can arise from the inheritance behavior of NetScaler Gateway policies.

Therefore, it is important to understand how inheritance works. NetScaler Gateway policies, be they pre-auth, session, or traffic policies, follow this processing order:

User level > Group level > vServer level > Global

In addition, certain policies, such as pre-auth, can only be bound at the global or vServer level since the User has not yet presented their username and cannot be identified as a result.
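The processing order can be pictured as a simple sort by bind level. The Python sketch below is a simplified model: it orders bindings purely by level as described above, ignores any other factor that may influence evaluation, and uses hypothetical policy names. The restriction on pre-auth policies follows the paragraph above.

```python
# Bind levels in the processing order given in the text.
BIND_ORDER = ["user", "group", "vserver", "global"]
LEVEL_RANK = {level: rank for rank, level in enumerate(BIND_ORDER)}


def processing_order(bindings):
    """Sort (level, policy_name) pairs into the order they are processed."""
    return sorted(bindings, key=lambda binding: LEVEL_RANK[binding[0]])


def allowed_levels(policy_type):
    """Return the levels a policy type can be bound at.

    Pre-auth policies run before the User is identified, so only the
    vServer and global levels apply to them.
    """
    if policy_type == "preauth":
        return ["vserver", "global"]
    return list(BIND_ORDER)


if __name__ == "__main__":
    # Hypothetical policy names, for illustration only.
    bindings = [
        ("global", "global_session_pol"),
        ("user", "user_session_pol"),
        ("vserver", "vserver_session_pol"),
        ("group", "group_session_pol"),
    ]
    for level, name in processing_order(bindings):
        print(f"{level}: {name}")
    print("pre-auth bind points:", allowed_levels("preauth"))
```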

The pol_hits nsconmsg counter is a very useful means of identifying the resultant set of policies. For example, in the following screenshot we can see that for the User who just logged in, the LDAP authentication policy (ldap_LDAP_pol), the global session policy (SETVPNPARAMS_POL), and a more specific session policy (192.168.1.55_443_POL) are being hit:

[Screenshot: Using the pol_hits counter to examine policy hits]

Note

In a busy environment, this command can present a lot of information in a short time. Furthermore, you might also have a challenge in being able to tell which User the output is for, when several users are logging in. For this reason, it would be best to use the command during a window when you are able to limit the users logging in, such as after hours.

Seeing and managing the users who are logged in

When troubleshooting, you will often need to make a configuration change to a vServer or its policies. This introduces a need to be able to:

  • Find out if and how many users are logged in
  • Get users to re-process policies by having them log out and log back in

The Active User sessions tool in the NetScaler Gateway tab is great for this purpose:

[Screenshot: Seeing and managing the users who are logged in]

Capturing traces for troubleshooting

Considering that NetScaler Gateway exchanges are basically SSL conversations, it's easy to see why Wireshark is a critical tool for troubleshooting. All the troubleshooting that we looked at in the SSL section applies here as well, including:

  • SSL handshake failures due to certificates not being trusted
  • TLS versions not matching
  • No matching ciphers between Receiver and NetScaler
  • Ports being blocked

The following points are important to remember while taking traces:

  1. Where possible, always identify a test User and note the username so that logs and traces can be correlated.
  2. Disable SSL reuse on the VPN vServer so that the trace can be decrypted.
    > set ssl vserver vpn.xmx.lab -sessReuse DISABLED
     Done

    Note

    Notice that in the preceding example, I used set ssl vserver and not set vpn vserver. This is how you change any SSL-related setting, even if the vServer is a VPN vServer.

  3. Set the trace size to 0. If you leave it at the default 164-byte truncated size, the complete certificate will not be captured, so the decryption will fail.
  4. Have the test User close their session and only log in after you have started the trace.
  5. For backend access issues, always take simultaneous traces of both legs: between the client and NetScaler, and between NetScaler and the backend server.