Troubleshooting

In troubleshooting multi-WAN setups, we can group the problems we are likely to encounter into three broad categories:

  • There may be a connectivity issue with either your primary or secondary connection.
  • There may be a misconfiguration in your setup.
  • pfSense thinks a gateway is up when it is actually down, or vice-versa.

The first point seems rather obvious, but you'll want to make sure that both connections are working independently of the gateway group. If the second connection accounts for a low percentage of our overall bandwidth (for example, our primary connection is broadband and our secondary connection is DSL), or is part of a failover group, then we might not notice it is not working until the primary connection fails. Ensuring that both connections work when we first set things up will potentially save us some grief; we wouldn't want to wait until our primary connection fails to find out that our secondary connection wasn't working either. If we are troubleshooting, confirming that both connections work before looking at anything else could save us a significant amount of time.

Before you declare your multi-WAN setup to be functional, you will want to perform a few tests. There are several ways you can simulate a WAN interface going down. In the first case (and for any type of connection), you should disconnect the cable between pfSense and your primary connection and confirm that you are still connected to the internet. Be sure to confirm that DNS resolution is also still working. Then reconnect the primary connection and disconnect the secondary connection, and perform the same test.
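The check described above (internet connectivity plus DNS resolution) can also be scripted and run from a LAN client each time you pull a cable. The following is a minimal sketch, not part of pfSense; the function names are made up for illustration, and the target address, hostname, and port are values you would choose yourself.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks, and timeouts.
        return False


def can_resolve(hostname):
    """Return True if the system resolver returns at least one address."""
    try:
        return bool(socket.getaddrinfo(hostname, None))
    except OSError:
        # socket.gaierror (resolution failure) is a subclass of OSError.
        return False


def failover_check(ip_target, hostname, port=443):
    """Report raw connectivity and DNS resolution as two separate results."""
    return {
        "connect_by_ip": can_connect(ip_target, port),
        "dns_resolution": can_resolve(hostname),
    }
```

With the primary connection unplugged, a call such as failover_check("1.1.1.1", "example.com") (both targets are only examples) should report True for both entries. If connect_by_ip is True but dns_resolution is False, traffic is flowing through the remaining WAN interface and the problem lies with DNS.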

If you have a cable, DSL, or other connection with a modem, you should try unplugging the modem and, as a separate test, unplugging the connection between the modem and the demarcation point (usually a coax or phone jack) and see what happens. For a T1 connection, unplug the internet connection from the router, and either unplug or power off the router. If you have more than two WAN interfaces in your setup, you may want to perform this test with different combinations of interfaces being down and up. As long as one interface in the gateway group is up, you should have some level of internet connectivity.

Such testing is valuable because sometimes you will uncover a configuration error that you would not have uncovered otherwise. For example, the OPT_WAN gateway could be using the public IP address of the WAN connection as its monitor IP. When you power off the modem connected to the WAN interface, pfSense marks OPT_WAN as down, even though the secondary connection is solid and OPT_WAN should be up.

Misconfiguration is a likely issue; creating a gateway group is relatively easy, but there are many steps needed to make everything work as it should. Primarily, we need to remember that there must be NAT rules for each new OPT_WAN interface, and there must be firewall rules to direct traffic to the appropriate gateway group (or to a particular gateway, if desired). You will especially want to check the firewall rules. By default, the rules direct traffic to the default gateway, so when the primary WAN interface is up, everything might seem to be working. Then when the primary WAN interface goes down, internet connectivity will be lost even though there are still functioning interfaces in the gateway group. Double-check the NAT and firewall rules to make sure they are behaving as they should.

If the flow of traffic within your multi-WAN setup is working as it should, but DNS resolution does not work, the issue may be that the gateway selected in your firewall rules only applies to traffic passing through pfSense from one interface to another. It does not apply to traffic that pfSense itself generates, such as its own DNS queries, which follow the system routing table instead. Therefore, you may have to configure a static route for the OPT_WAN DNS server.

Sometimes gateway load balancing fails because one of the WAN connections in a gateway group no longer has internet connectivity, but pfSense still shows its gateway as up. This may be because the monitor IP is still responding (for example, if you set it to an IP on your internal network), so pfSense thinks the connection is still good. If so, ensure that the monitor IP is correct.
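Two of the monitor IP mistakes mentioned in this section (a monitor IP on an internal network, and a monitor IP that is actually the address of another WAN connection) are easy to check mechanically. The following is a minimal sketch using Python's standard ipaddress module; the function name and the idea of passing in a list of other WAN addresses are assumptions made for illustration, not something pfSense provides.

```python
import ipaddress


def monitor_ip_warnings(monitor_ip, other_wan_ips=()):
    """Return a list of warnings for a proposed gateway monitor IP."""
    addr = ipaddress.ip_address(monitor_ip)
    warnings = []

    # A private, loopback, or otherwise non-global address can keep
    # answering pings even when the WAN link has no internet connectivity.
    if not addr.is_global:
        warnings.append(f"{addr} is not a globally routable address")

    # Monitoring another WAN's address ties this gateway's status to the
    # health of that other connection.
    if any(addr == ipaddress.ip_address(ip) for ip in other_wan_ips):
        warnings.append(f"{addr} is the address of another WAN connection")

    return warnings
```

An empty list means only that neither of these two mistakes was found; it does not prove that the monitor IP will answer pings or that it is reached through the intended gateway.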

Another issue with monitor IPs is the case where an external site is used for the monitor IP. The network administrator at the external site sees pfSense's pings to the site, and suspects this could be the beginning of a denial-of-service attack. The admin then blocks your monitor pings, and because the monitor pings fail, the gateway goes down, even though the network interface is still functioning and has internet connectivity. This potentially could happen whenever you set a monitor IP to a site you don't control. The solution, of course, is to use an external site you control for the monitor IP. In most cases, using the DNS server as your monitor IP should not result in your pings being blocked.

When you connect to certain websites, such as e-commerce sites, they will store session information, including your public IP address. If you subsequently connect to such a site through a different public IP address, the site might not function as it should. The Use Sticky Connections option is designed to eliminate this problem by sending your traffic to such websites through the same WAN gateway so long as there are states that refer to the connection.

If your multi-WAN setup implements load balancing, you should verify that the load balancing works. One of the ways to do this is to visit websites that tell you what your public IP address is (just do a web search for "what is my IP address"), and keep refreshing the page. If you refresh the page multiple times, you should see your public IP address change. You may have to reload the page several times, however, since there may be other traffic on your network; in addition, you may have set the weights on your gateway group to heavily favor one interface above all others. Eventually, however, you should see your IP address change.

If your IP address never changes, make sure you are really reloading the page and not accessing the page from a cache (as may be the case if you are using a web proxy). Also, make sure that you don't have sticky connections enabled or some other option that enables persistent connections. Deleting your web cache, visiting different "what is my IP"-type sites, and trying different web browsers are good options to try before further troubleshooting load balancing.
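The manual refresh test can also be scripted, which avoids browser caching and makes it easy to see how often each public address appears. The following is a rough sketch under stated assumptions: fetch_public_ip expects a URL for a service that returns your public IP address as plain text (you must supply one; none is assumed here), and both function names are made up for illustration.

```python
from collections import Counter
import urllib.request


def fetch_public_ip(url, timeout=5.0):
    """Fetch a plain-text public IP from an IP echo service at url."""
    req = urllib.request.Request(url, headers={"Cache-Control": "no-cache"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("ascii", errors="replace").strip()


def tally_public_ips(fetch, attempts=20):
    """Call fetch() repeatedly and count how often each address was seen."""
    seen = Counter()
    for _ in range(attempts):
        try:
            seen[fetch()] += 1
        except OSError:
            # urllib.error.URLError and socket timeouts are OSError subclasses.
            seen["<error>"] += 1
    return seen
```

You would call it as tally_public_ips(lambda: fetch_public_ip(url)), where url is the echo service you chose. Each call opens a new connection, so with load balancing working and sticky connections disabled, the result should eventually contain more than one address, in proportions that roughly reflect the gateway weights and whatever other traffic is on the network.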

Another way to test load balancing is to use the traceroute command. We will cover this command in greater detail in Chapter 12, Troubleshooting pfSense, but for now, you should know that it is a command that displays the route a packet takes to a site, while also displaying the transit delay at each step. It is available in Windows (where it is named tracert), Linux/BSD, and macOS.

Finally, another possible issue is that the default packet loss and latency settings are generating false positives (the opposite could also be true, and your connection could be down without pfSense detecting it, though this is less likely). In such cases, you need to navigate to System | Routing, and on the Gateways tab, edit the Advanced Settings for the gateway that is generating false positives. Increasing the Time Period over which results are averaged may help.
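To see why a longer averaging period reduces false positives, consider the following simplified model. It is only an illustration of averaging over a window, not pfSense's actual gateway monitoring algorithm; the function name, the window sizes, and the 20% threshold are all assumed values.

```python
from collections import deque


def loss_alarms(probe_lost, window, loss_threshold_pct):
    """For each probe, report whether average loss over the window is too high.

    probe_lost is a sequence of booleans (True means the probe got no reply).
    Until the window has filled, the missing probes are counted as received.
    """
    recent = deque(maxlen=window)
    alarms = []
    for lost in probe_lost:
        recent.append(1 if lost else 0)
        loss_pct = 100.0 * sum(recent) / window
        alarms.append(loss_pct > loss_threshold_pct)
    return alarms


# A short burst of three lost probes in an otherwise healthy minute.
burst = [False] * 10 + [True] * 3 + [False] * 47

short_window = loss_alarms(burst, window=5, loss_threshold_pct=20)
long_window = loss_alarms(burst, window=30, loss_threshold_pct=20)
```

With the short window, the burst pushes the average loss to 60% and raises an alarm; with the long window, the same burst averages out to 10% and stays under the threshold. The trade-off is that a longer window also takes longer to react to a real outage.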
