Chapter 10. Troubleshooting Network Issues

No network is perfect. Regardless of how well we plan and implement our infrastructure, problems can and will happen. The most important skill you need as a network administrator is the ability to troubleshoot. When problems occur, thinking rationally and narrowing down the issue by process of elimination will carry you through. While it can certainly be stressful when things go haywire, network administrators at least enjoy the job security. In this final chapter of our journey, we'll work through troubleshooting some common issues that may come up in Linux networks. We will cover:

  • Tracing routing issues
  • Troubleshooting DHCP issues
  • Troubleshooting DNS issues
  • Displaying connection statistics with netstat
  • Scanning your network with nmap and Zenmap
  • Installing missing firmware on Debian systems
  • Troubleshooting issues with Network Manager

Tracing routing issues

The entire purpose of a network is to get data from point A to point B. When data isn't getting where we need it to go, it can be a pain to pinpoint exactly where the problem lies. But by working through the process of elimination, tracking down a routing issue shouldn't be too difficult.

Whenever a node is unable to communicate with a specific server or network, I like to work my way from the workstation back to the switch stack until I find the issue. To start, I check the obvious things, such as what the IP address is (or whether the machine even has one), and then I check the routing table. If the problem is intermittent, you would likely want to test the cable. I've come across quite a few instances where a problem resulted from a bad cable. I don't know why, but other administrators I know don't seem to share this luck. Still, it never hurts to run a cable tester on the network cable, just in case.

Assuming that you've already tried the easy stuff, next you would want to determine whether or not you can reach the default gateway. If you know the IP address of your local default gateway, simply ping it to see if you can reach it, and note the result. Does your attempt time out, or does it get through just fine? If you don't know the IP address of your gateway, run route -n in your terminal emulator to find out. If you can reach your default gateway by IP, try to reach it by hostname as well, and then try the IP address of the target node you were trying to connect to in the first place. If you're able to reach resources by IP but not by hostname, this is most likely a DNS issue. We'll talk about troubleshooting DNS later in this chapter. For now, determining whether or not you can reach your DNS server and your gateway is a good first step. If you can't, you may have a resource that is down, and a line of angry co-workers waiting for you back at your desk.
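The gateway check described above can be sketched as a pair of small shell helpers. Treat this as a minimal sketch rather than a finished tool: the function names default_gateway and check_gateway are made up for illustration, and the sketch assumes that the ip command from iproute2 (whose ip route show prints the same routing information as route -n), along with ping and awk, are installed.

```shell
#!/bin/sh
# default_gateway: read routing-table text in `ip route show` format on
# stdin and print the gateway address of the first default route.
default_gateway() {
    awk '$1 == "default" && $2 == "via" { print $3; exit }'
}

# check_gateway: ping the default gateway three times and report the result.
# (Hypothetical helper; run it on the machine you are troubleshooting.)
check_gateway() {
    gw=$(ip route show | default_gateway)
    if [ -z "$gw" ]; then
        echo "no default gateway found in the routing table"
        return 1
    fi
    # -c 3: send three probes; -W 2: wait up to two seconds for each reply.
    if ping -c 3 -W 2 "$gw" > /dev/null 2>&1; then
        echo "gateway $gw is reachable"
    else
        echo "gateway $gw did not answer"
        return 1
    fi
}
```

After sourcing these functions, running check_gateway on the affected machine covers the first step. From there, pinging the target node by IP address and then by hostname will tell you whether to suspect routing or DNS.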

If the problem is intermittent, we can start our troubleshooting by interrogating the local machine. The ip address show command will give us some details about the IP address of the local machine. We can actually shorten this command by abbreviating it to ip addr show, or if you really don't like typing, you can simplify it down further to just ip a. The following shows the output of ip addr show from an example system:

Investigating the IP address on a local machine

At this point in the book, there shouldn't be anything too surprising about the output of ip a. However, the output from my machine may look different from what you see in the wild, so it's worth going over. First, you can see that the Debian machine I used for testing has five network interfaces. The first is the local loopback adapter, lo, and the second is eth0. Since this machine is currently using Wi-Fi, it's no surprise that eth0 doesn't have an IP address. The next interface, wlan0, has an IP address of 192.168.1.106. The last two interfaces are special; they exist as bridges so that Docker and KVM virtualization can perform their own networking. Even though Docker and KVM aren't within the scope of this book, I bring them up because when one of these services is installed, your Linux desktop environment may report that you are connected to a network even when, for practical purposes, you aren't. On my machine, if I disconnect wlan0, the desktop still shows that I'm connected. This is because the GUI version of Network Manager that most graphical distributions ship with does a terrible job of reporting an accurate status with regard to your connectivity, and this can confuse the situation.
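When a machine has several interfaces, including bridges like the ones described above, it can help to boil the output down to just the interfaces that actually hold an IPv4 address. The following is a minimal sketch under a few assumptions: the function name ipv4_addresses is made up for illustration, and it expects input in the one-line-per-address format printed by ip -o -4 addr show (the -o option puts each address on a single line, and -4 limits the output to IPv4).

```shell
#!/bin/sh
# ipv4_addresses: read `ip -o -4 addr show` output on stdin and print one
# "interface address/prefix" pair per line.
# In that format, field 2 is the interface name, field 3 is the literal
# word "inet", and field 4 is the address with its prefix length.
ipv4_addresses() {
    awk '$3 == "inet" { print $2, $4 }'
}
```

On a live system, you would run ip -o -4 addr show | ipv4_addresses. An interface with no IPv4 address, such as eth0 on the example machine, simply won't appear in the list, which makes a missing address easy to spot.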

Now that you've determined that the machine has an IP address, another step you can take is to use the traceroute command. Those of you who have used Windows may be familiar with this concept already, as the Windows utility tracert works pretty much the same way. The traceroute utility is not always installed by default when you set up a Linux distribution, so you may need to install the traceroute package. From here, you should be able to use traceroute along with the hostname or IP of a resource to see where the process drops out. You can also run traceroute against the hostname of a website, if the issue is that your workstation isn't able to access the public Internet. In the following screenshot, a traceroute against google.com is shown:

Running traceroute to troubleshoot accessing the public Internet

In the previous screenshot, I ran a traceroute to www.google.com. From the output, we can tell several things right away. First, we can see that the first hop is a device called m0n0wall.local at an IP address of 192.168.1.1. If I run route -n, I see that this is the default gateway of the network I'm currently using. m0n0wall is a FreeBSD-based firewall distribution, and I discovered that it is in use on this network when I ran the command. Next, we can see that we made it through the m0n0wall device to 172.21.0.1, which is another private address, and then to 198.111.175.120, but the output stops after my request reaches 198.108.22.150. After that, we just see asterisks, which means that no replies came back for those hops. In a hypothetical example of my machine not being able to access the Internet, I may want to investigate the device at 198.108.22.150 and find out why it's not letting my traffic through. However, in my case nothing is actually wrong: this device is dropping ICMP packets, which causes the traceroute command itself to fail.
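Reading a long traceroute by eye gets tedious, so here is a minimal sketch of how the "where does it stop?" question could be answered with a small filter. The function name last_responding_hop is made up for illustration, and the sketch assumes the input comes from traceroute -n, which prints numeric addresses instead of hostnames so that the first non-asterisk field on each hop line is an IP address.

```shell
#!/bin/sh
# last_responding_hop: read `traceroute -n` output on stdin and print the
# number and address of the last hop that sent back any reply.
last_responding_hop() {
    awk '
        # Hop lines start with a hop number; the header line does not.
        $1 ~ /^[0-9]+$/ {
            # Skip probes that timed out ("*") and take the first address.
            for (i = 2; i <= NF; i++) {
                if ($i != "*") { hop = $1; addr = $i; break }
            }
        }
        END { if (hop != "") print hop, addr }
    '
}
```

On a live system, you would run traceroute -n www.google.com | last_responding_hop. Keep in mind that the last responding hop is only a starting point for investigation. As in the example above, a device that silently drops the probes or their replies can make a healthy path look broken.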

One of the things you would definitely want to check when troubleshooting routing issues is your routing table. We covered routing, including the routing table and adding routes, in Chapter 8, Understanding Advanced Networking Concepts. As a refresher, you can use route -n to print the routing table in your shell. If the machine you're troubleshooting doesn't have a route to the network it needs to access, the root cause is readily apparent. You would then need to add a route to that network, or a default gateway, to allow the machine to reach it.

Viewing the local routing table
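As a rough illustration of checking the routing table for a default route, the following sketch filters the output of route -n. The function name default_route is made up for illustration. It assumes the usual net-tools column layout (Destination, Gateway, Genmask, Flags, Metric, Ref, Use, Iface), in which the default route has a destination of 0.0.0.0 and a G in the Flags column.

```shell
#!/bin/sh
# default_route: read `route -n` output on stdin, print the gateway and
# interface of the default route, and return non-zero if there is none.
default_route() {
    awk '
        # Default route: destination 0.0.0.0 with the G (gateway) flag set.
        $1 == "0.0.0.0" && $4 ~ /G/ { print $2, $8; found = 1; exit }
        END { exit (found ? 0 : 1) }
    '
}
```

On a live system, you would run route -n | default_route. If nothing is printed, the machine has no default gateway, and one could be added as root with a command such as ip route add default via 192.168.1.1, where the address shown is the gateway from the example network and will differ on yours.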
